


PTO/SB/21 (6-99)
Approved for use through 09/30/2000. OMB 0651-0031 
Patent and Trademark Office: U.S. DEPARTMENT OF COMMERCE 



TRANSMITTAL FORM

(to be used for all correspondence after initial filing) 


Application Number 


10/006,063 


Filing Date 


December 6, 2001 


First Named Inventor 


Kevin P. Baker 


Group/Art Unit 


1647 


Examiner Name 


Hamud, Fozia M. 


Total Number of Pages in This Submission 


163 


Attorney Docket Number 


39780-2830 P1C3 



ENCLOSURES (check all that apply) 



I I Fee Transmittal Form 

PI Fee Attached 
I I Amendment / Response 

□ After Final 

I I Version With Markings Showing 
Changes 

EH Affidavits/declaration(s) 

I I Extension of Time Request 

I | Information Disclosure Statement 

I I Certified Copy of Priority Document(s) 

I I Response to Missing Parts/ Incomplete 
Application 

I I Response to Missing 
Parts under 37 CFR 
1.52 or 1.53 

"1 I Copy of Notice 



p~] Copy of an Assignment 

I I Drawing(s) 

|p] Licensing-related Papers 

□ Petition Routing Slip (PTO/SB/69) 
and Accompanying Petition 

I I Petition to Convert to a 
Provisional Application 

|~] Power of Attorney, by Assignee to 
~ ! Exclusion of Inventor Under 37 C.F.R. 
§3.71 With Revocation of Prior Powers 

| | Terminal Disclaimer 
| [ Small Entity Statement 
| | Request for Refund 



Remarks 



□ 
□ 



□ 
□ 



After Allowance Communication to 
Group 

Appeal Communication to Board of 
Appeals and Interferences 
Appeal Communication to Group 
(Appeal Notice, Brief, Reply Brief) 

Request for Oral Hearing 
Status Letter 

ADDITIONAL ENCLOSURE(S) 
(PLEASE IDENTIFY BELOW): 



EVIDENCE APPENDIX ITEMS 1-9; 
AND; RETURN POSTCARD 



AUTHORIZATION TO CHARGE DEPOSIT ACCOUNT 08-1641 FOR ANY FEES DUE IN 
CONNECTION WITH THIS PAPER, REFERENCING ATTORNEY'S DOCKET NO. 39780- 
2830P1C3. 



SIGNATURE OF APPLICANT, ATTORNEY OR AGENT

Firm or Individual name: HELLER EHRMAN LLP, BARRIE D. GREENE (Reg. No. 46,740)
275 Middlefield Road, Menlo Park, California 94025; Telephone: (650) 324-7000; Facsimile: (650) 324-0638

Signature:

Date: NOVEMBER 22, 2005



Customer Number: 



35489 



CERTIFICATE OF EXPRESS MAILING 



I hereby certify that this correspondence is being deposited with the United States Postal Service "Express Mail Post Office to Addressee" service under
37 C.F.R. §1.10 on the date indicated below and addressed to: MAIL STOP APPEAL BRIEF - PATENTS, Commissioner for Patents, PO Box 1450,
Alexandria, Virginia 22313-1450, on this date: NOVEMBER 22, 2005



Express Mail Label EV 582 622 970 US

Typed or printed name: ELENA TORRES

Signature:

Date: NOVEMBER 22, 2005



Burden Hour Statement This form is estimated to take 0.2 hours to complete. Time will vary depending upon the needs of the individual case. Any comments on the amount of 
time you are required to complete this form should be sent to the Chief Information Officer, Patent and Trademark Office, Washington, DC 20231. DO NOT SEND FEES OR 
COMPLETED FORMS TO THIS ADDRESS. SEND TO: Mail Stop Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450. 




IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of: Kevin P. BAKER, et al.

Application Serial No. 10/006,063

Filed: December 6, 2001

For: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC ACIDS ENCODING THE SAME

Examiner: Hamud, Fozia M.

Art Unit: 1647

Confirmation No: 8559

Attorney's Docket No. 39780-2830 P1C3

Customer No. 35489

EXPRESS MAIL LABEL NO.: EV 582 622 970 US
DATE MAILED: November 22, 2005



ON APPEAL TO THE BOARD OF PATENT APPEALS AND INTERFERENCES 

APPELLANTS' AMENDED BRIEF IN RESPONSE TO NOTICE OF NON- 
COMPLIANT APPEAL BRIEF 



MAIL STOP APPEAL BRIEF - PATENTS 

Commissioner for Patents 
P.O. Box 1450 

Alexandria, Virginia 22313-1450 
Dear Sir: 

On November 29, 2004, the Examiner made a final rejection to pending Claims 28-36 
and 38-40. A Notice of Appeal was filed on February 28, 2005, and an Appeal Brief was filed on 
July 27, 2005. 

A Notification of Non-Compliant Appeal Brief was mailed October 31, 2005, which 
stated that the brief was defective because required elements were missing. The following 
amended appeal brief has been corrected to include all headings and sections as required under 
37 C.F.R. §41.37(a).

The following constitutes the amended version of Appellants' Brief on Appeal.

1. REAL PARTY IN INTEREST

The real party in interest is Genentech, Inc., South San Francisco, California, by an 
assignment of the patent application U.S. Serial No. 09/946,374 recorded January 8, 2002, at 
Reel 012288 and Frame 0504. 

2. RELATED APPEALS AND INTERFERENCES 

The claims pending in the current application are directed to a polypeptide referred to
herein as "PRO1293". There exist two related patent applications, (1) U.S. Serial No.
10/015,869, filed December 11, 2001 (containing claims directed to polynucleotides encoding
PRO1293 polypeptides), and (2) U.S. Serial No. 10/006,818, filed December 6, 2001 (containing
claims directed to antibodies that bind PRO1293 polypeptides). The 10/015,869 application is
still pending. The 10/006,818 application is also under final rejection from the same Examiner
and based upon the same outstanding rejection, and appeal of this final rejection is being pursued
independently and concurrently herewith.

3. STATUS OF CLAIMS 

Claims 28-36 and 38-40 are in this application. 
Claims 1-27 and 37 are canceled. 

Claims 28-36 and 38-40 stand rejected and Appellants appeal the rejection of these claims.

A copy of the rejected claims involved in the present Appeal is provided in the Claims 
Appendix. 

4. STATUS OF AMENDMENTS 

There were no amendments to the claims submitted after final rejection. All previous
amendments to the claims have been entered.

5. SUMMARY OF CLAIMED SUBJECT MATTER

The invention claimed in the present application is related to an isolated polypeptide
comprising the amino acid sequence of the polypeptide of SEQ ID NO:77; the amino acid
sequence of the polypeptide of SEQ ID NO:77, lacking its associated signal peptide; the amino
acid sequence of the extracellular domain of the polypeptide of SEQ ID NO:77; or the amino
acid sequence of the polypeptide encoded by the full-length coding sequence of the cDNA
deposited under ATCC accession number 203292 (Claims 33-36 and 38). The invention is
further directed to polypeptides having at least 80%, 85%, 90%, 95%, or 99% amino acid
sequence identity to the amino acid sequence of the polypeptide of SEQ ID NO:77; the amino
acid sequence of the polypeptide of SEQ ID NO:77, lacking its associated signal peptide; the
amino acid sequence of the extracellular domain of the polypeptide of SEQ ID NO:77; or the
amino acid sequence of the polypeptide encoded by the full-length coding sequence of the cDNA
deposited under ATCC accession number 203292, wherein the nucleic acid encoding said
polypeptide is amplified in lung or colon tumor (Claims 28-32). The invention is further directed
to a chimeric polypeptide comprising one of the above polypeptides fused to a heterologous
polypeptide (Claim 39), and to a chimeric polypeptide wherein the heterologous polypeptide is
an epitope tag or an Fc region of an immunoglobulin (Claim 40).

The full-length PRO1293 polypeptide having the amino acid sequence of SEQ ID NO:77
is described in the specification at, for example, page 8, lines 2-13, page 338, lines 1-5, Example
26, in Figure 46 and in SEQ ID NO:77. The cDNA nucleic acid encoding PRO1293 is described
in the specification at, for example, Example 26, in Figure 45 and in SEQ ID NO:76. Page 287,
lines 20-24 of the specification provides the description for Figures 45 and 46. PRO polypeptide
variants having at least about 80% amino acid sequence identity with a full length PRO
polypeptide sequence, a PRO polypeptide sequence lacking the signal peptide, or an extracellular
domain of a PRO polypeptide are described in the specification at, for example, page 302, lines
4-26. The preparation of chimeric PRO polypeptides, including those wherein the heterologous
polypeptide is an epitope tag or an Fc region of an immunoglobulin, is set forth in the
specification at page 358, lines 11-34. Examples 128-131 describe the expression of PRO
polypeptides in various host cells, including E. coli, mammalian cells, yeast and
Baculovirus-infected insect cells. PRO1293 is described as having amino acid sequence identity with the
human Ig heavy chain V region protein and as being a newly identified member of the Ig
superfamily of proteins (see, for example, page 338, lines 1-5). Finally, Example 143, in the
specification at page 494, line 20, to page 508, line 28, sets forth a Gene Amplification assay
which shows that the PRO1293 gene is amplified in the genome of certain human lung and colon
cancers (see page 507, lines 5-12, and Table 8).

6. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

I. Whether Claims 28-36 and 38-40 satisfy the utility requirement of 35 U.S.C.
§101.

II. Whether Claims 28-36 and 38-40 satisfy the enablement requirement of 35 U.S.C.
§112, first paragraph.

III. Whether Claims 28-32 satisfy the written description requirement of 35 U.S.C.
§112, first paragraph.

IV. Whether Claims 28-36 and 38-40 are patentable under 35 U.S.C. §102(a) over
Botstein et al., WO 2000053751 and Baker et al., WO 200012708.

7. ARGUMENTS 
Summary of the Arguments: 
Issue I: Utility 

Claims 28-36 and 38-40 stand rejected under 35 U.S.C. §101 as allegedly lacking either a
specific and substantial asserted utility or a well established utility. Appellants have previously
explained that patentable utility of the PRO1293 polypeptides is based upon the gene
amplification data for the gene encoding the PRO1293 polypeptide. The specification discloses
that the gene encoding PRO1293 showed significant amplification, ranging from 2.2 to 5 fold, in
3 different lung and colon tumors. Appellants have also submitted, with their Response filed
September 9, 2004, the Declaration of Dr. Audrey Goddard, which explains that a gene identified
as being amplified at least 2-fold by the disclosed gene amplification assay in a tumor sample
relative to a normal sample is useful as a marker for the diagnosis of cancer, for monitoring
cancer development and/or for measuring the efficacy of cancer therapy.

The Examiner has asserted that "the instant specification does not demonstrate that the
increased copy number of PRO1293 in lung and colon tumors leads to an increased expression of
PRO1293 in these tumors." (Page 4 of the Office Action mailed November 29, 2004). In
support of this assertion, the Examiner has cited a reference by Hu et al. as evidence that "gene
amplification does not necessarily result in increased expression at the mRNA and polypeptide
levels" (Page 4 of the Office Action mailed November 29, 2004; emphasis added). The
Examiner has further cited Pennica et al. in support of the assertion that "protein levels cannot be
accurately predicted from the level of the corresponding gene." (Page 5 of the Office Action
mailed May 13, 2004).

Appellants submit that the Examiner applied an improper legal standard when making 
this rejection. The evidentiary standard to be used throughout ex parte examination in setting 
forth a rejection is a preponderance of the totality of the evidence under consideration. Thus, to 
overcome the presumption of truth that an assertion of utility by the applicant enjoys, the 
Examiner must establish that it is more likely than not that one of ordinary skill in the art would 
doubt the truth of the statement of utility. Only after the Examiner has made a proper prima facie 
showing of lack of utility, does the burden of rebuttal shift to the applicant. 

The two sole references cited by the Examiner do not suffice to make a prima facie case
that more likely than not no generalized correlation exists between gene (DNA) amplification
and increased polypeptide levels. In particular, the teachings of Pennica et al. are not directed
towards genes in general but to genes within a single family and thus, these teachings cannot
support a general conclusion regarding correlation between gene amplification and mRNA or
protein levels. Nor does Hu et al. suffice to show that a lack of correlation between gene
amplification data and the biological significance of cancer genes is typical.

In contrast, Appellants have submitted ample evidence to show that, in general, if a gene
is amplified in cancer, it is more likely than not that the encoded protein will be expressed at an
elevated level. First, the articles by Orntoft et al., Hyman et al., and Pollack et al. (made of
record in Appellants' Response filed September 9, 2004) collectively teach that in general gene
amplification increases mRNA expression. Second, the Declaration of Dr. Paul Polakis,
principal investigator of the Tumor Antigen Project of Genentech, Inc., the assignee of the
present application, shows that, in general, there is a correlation between mRNA levels and
polypeptide levels. Appellants further note that the sale of gene expression chips to measure
mRNA levels is a highly successful business, with a company such as Affymetrix recording
168.3 million dollars in sales of their GeneChip arrays in 2004. Clearly, the research community
believes that the information obtained from these chips is useful (i.e., that it is more likely than
not informative of the protein level).

Taken together, although there are some examples in the scientific art that do not fit
within the central dogma of molecular biology that there is a correlation between DNA, mRNA,
and polypeptide levels, these instances are exceptions rather than the rule. In the majority of
amplified genes, as exemplified by Orntoft et al., Hyman et al., Pollack et al., and the Polakis
Declaration, the teachings in the art overwhelmingly show that gene amplification influences
gene expression at the mRNA and protein levels. Therefore, one of skill in the art would
reasonably expect in this instance, based on the amplification data for the PRO1293 gene, that
the PRO1293 polypeptide is concomitantly overexpressed. Thus, the claimed PRO1293
polypeptides have utility in the diagnosis of cancer.

Appellants further submit that even if there is no correlation between gene amplification
and increased mRNA/protein expression (which Appellants expressly do not concede), a
polypeptide encoded by a gene that is amplified in cancer would still have a specific, substantial,
and credible utility. Appellants submit that, as evidenced by the Ashkenazi Declaration and the
teachings of Hanna and Mornin (submitted with Appellants' Response filed September 9, 2004),
simultaneous testing of gene amplification and gene product over-expression enables more
accurate tumor classification, even if the gene product, the protein, is not over-expressed. This
leads to better determination of a suitable therapy for the tumor, as demonstrated by the
real-world example of the breast cancer marker HER-2/neu.

Accordingly, Appellants submit that when the proper legal standard is applied, one
should reach the conclusion that the present application discloses at least one patentable utility
for the claimed PRO1293 polypeptides.

Issue II: Enablement

Claims 28-36 and 38-40 stand rejected under 35 U.S.C. §112, first paragraph, allegedly 
"since the claimed invention is not supported by either a specific and substantial asserted utility or a 
well established utility for the reasons set forth above, one skilled in the art clearly would not know 
how to use the claimed invention." (Page 4 of the Office Action mailed November 29, 2004). 
Claims 28-32 further stand rejected under 35 U.S.C. §112, first paragraph as allegedly lacking 
enablement for the claimed polypeptide variants. 

Appellants submit that, as discussed above, the PRO1293 polypeptides have utility in the
diagnosis of cancer. Based on such a utility, one of skill in the art would know exactly how to
use the claimed polypeptides for diagnosis of cancer, without any undue experimentation.

Appellants note that the claimed variants, in addition to having at least 80% amino acid
sequence identity to SEQ ID NO:77, also have the functional limitation that "the nucleic acid
encoding said polypeptide is amplified in lung or colon tumors." Thus the claimed variants all
share the disclosed utility of the PRO1293 polypeptide in the diagnosis of cancer. The
specification provides ample guidance to allow the skilled artisan to identify those polypeptide
variants which meet the limitations of the claims, including a detailed protocol for the gene
amplification assay. The specification also provides detailed guidance as to how to identify and
make polypeptides having at least 80% amino acid sequence identity to PRO1293 (SEQ ID
NO:77). Accordingly, one of ordinary skill in the art would understand how to make and use the
recited polypeptide variants without any undue experimentation.

Issue III: Written Description 

Claims 28-32 stand rejected under 35 U.S.C. §112, first paragraph as allegedly lacking 
adequate written description for the claimed variant polypeptides. In particular, the Examiner 
has asserted that "the claims are not defined by structure and functional identity." (Page 12 of the 
Office Action mailed November 29, 2004). 

Appellants note that the claims recite structural features, namely, 80% sequence identity
to SEQ ID NO:77, which are common to the genus. The specification provides detailed guidance
as to how to identify the recited variants of SEQ ID NO:77, including methods for determining
percent identity between two amino acid sequences, as well as listings of exemplary and
preferred sequence substitutions. The genus of claimed polypeptides is further defined by having
a specific functional activity for the encoding nucleic acids, namely, that the encoding nucleic
acid is amplified in lung and colon tumors. Example 143 of the present application provides
step-by-step guidelines and protocols for a gene amplification assay. By following the disclosure
in the specification, one skilled in the art can easily test whether a gene encoding a variant
PRO1293 protein is amplified in lung or colon tumors. Accordingly, one of skill in the art could
identify whether the variant PRO1293 sequence falls within the parameters of the claimed
invention.

Accordingly, a description of the claimed genus has been achieved by the recitation of 
both structural and functional characteristics. 

Issue IV: Anticipation by Botstein et al., WO 2000053751 and/or Baker et al., WO
200012708

Claims 28-36 and 38-40 stand rejected under 35 U.S.C. §102(a) as being anticipated by
Botstein et al., WO 2000053751, published on September 14, 2000, and by Baker et al., WO
200012708, published on March 9, 2000.

The instant application claims priority to U.S. Provisional Application Serial No.
60/162,506, filed on October 29, 1999, over ten months before the publication date of Botstein et
al. and over four months before the publication date of Baker et al. The instant application has
not been granted the earlier priority date on the grounds that "the parent application does not
teach how to use the claimed invention in a manner that satisfies the requirements under 35
U.S.C. 112, first paragraph." (Page 13 of the Office Action mailed November 29, 2004).
Appellants respectfully submit that as discussed above under Issues I and II, the presently
claimed invention is supported by a specific, substantial and credible utility and, therefore, the
present specification teaches one of ordinary skill in the art "how to use" the claimed invention
without undue experimentation. Accordingly, the instant application is entitled to the effective
filing date of October 29, 1999, and thus neither Botstein et al. nor Baker et al. is prior art.

These arguments are all discussed in further detail below under the appropriate headings.

ISSUE I: Claims 28-36 and 38-40 satisfy the utility requirement of 35 U.S.C. §101

Claims 28-36 and 38-40 stand rejected under 35 U.S.C. §101 because allegedly "the 
claimed invention is not supported by either a specific and substantial asserted utility or a well 
established utility." (Page 3 of the Office Action mailed November 29, 2004). 

Appellants submit, for the reasons set forth below, that the specification discloses at least
one credible, substantial and specific asserted utility for the claimed PRO1293 polypeptides.

A. The Legal Standard for Utility 

According to 35 U.S.C. § 101: 

Whoever invents or discovers any new and useful process, machine, manufacture, or 
composition of matter, or any new and useful improvement thereof, may obtain a 
patent therefor, subject to the conditions and requirements of this title. (Emphasis 
added). 

In interpreting the utility requirement, in Brenner v. Manson, 1 the Supreme Court held
that the quid pro quo contemplated by the U.S. Constitution between the public interest and the
interest of the inventors required that a patent applicant disclose a "substantial utility" for his or
her invention, i.e. a utility "where specific benefit exists in currently available form." 2 The Court
concluded that "a patent is not a hunting license. It is not a reward for the search, but
compensation for its successful conclusion. A patent system must be related to the world of
commerce rather than the realm of philosophy." 3

Later, in Nelson v. Bowler, 4 the C.C.P.A. acknowledged that tests evidencing 
pharmacological activity of a compound may establish practical utility, even though they may not 
establish a specific therapeutic use. The court held that "since it is crucial to provide researchers 
with an incentive to disclose pharmaceutical activities in as many compounds as possible, we 
conclude adequate proof of any such activity constitutes a showing of practical utility." 5 

1 Brenner v. Manson, 383 U.S. 519, 148 U.S.P.Q. (BNA) 689 (1966). 

2 Id. at 534, 148 U.S.P.Q. (BNA) at 695. 

3 Id. at 536, 148 U.S.P.Q. (BNA) at 696. 

4 Nelson v. Bowler, 626 F.2d 853, 206 U.S.P.Q. (BNA) 881 (C.C.P.A. 1980). 

5 Id. at 856, 206 U.S.P.Q. (BNA) at 883.

In Cross v. Iizuka, 6 the C.A.F.C. reaffirmed Nelson, and added that in vitro results might
be sufficient to support practical utility, explaining that "in vitro testing, in general, is relatively
less complex, less time consuming, and less expensive than in vivo testing. Moreover, in vitro
results with the particular pharmacological activity are generally predictive of in vivo test results,
i.e. there is a reasonable correlation therebetween." 7 The court perceived "no insurmountable
difficulty" in finding that, under appropriate circumstances, "in vitro testing, may establish a
practical utility." 8

The case law has also clearly established that Appellants' statements of utility are usually
sufficient, unless such statement of utility is unbelievable on its face. 9 The PTO has the initial
burden to prove that Appellants' claims of usefulness are not believable on their face. 10 In
general, an Applicant's assertion of utility creates a presumption of utility that will be sufficient
to satisfy the utility requirement of 35 U.S.C. §101, "unless there is a reason for one skilled in the
art to question the objective truth of the statement of utility or its scope." 11,12

Compliance with 35 U.S.C. §101 is a question of fact. 13 The evidentiary standard to be 
used throughout ex parte examination in setting forth a rejection is a preponderance of the 
totality of the evidence under consideration. 14 Thus, to overcome the presumption of truth that 
an assertion of utility by the applicant enjoys, the Examiner must establish that it is more likely 
than not that one of ordinary skill in the art would doubt the truth of the statement of utility. 



6 Cross v. Iizuka, 753 F.2d 1047, 224 U.S.P.Q. (BNA) 739 (Fed. Cir. 1985). 

7 Id. at 1050, 224 U.S.P.Q. (BNA) at 747. 

8 Id.

9 In re Gazave, 379 F.2d 973, 154 U.S.P.Q. (BNA) 92 (C.C.P.A. 1967).

10 Ibid.

11 In re Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. (BNA) 288, 297 (C.C.P.A. 1974).

12 See also In re Jolles, 628 F.2d 1322, 206 U.S.P.Q. 885 (C.C.P.A. 1980); In re Irons, 340 F.2d 974, 144
U.S.P.Q. 351 (1965); In re Sichert, 566 F.2d 1154, 1159, 196 U.S.P.Q. 209, 212-13 (C.C.P.A. 1977).

13 Raytheon v. Roper, 724 F.2d 951, 956, 220 U.S.P.Q. (BNA) 592, 596 (Fed. Cir. 1983) cert. denied, 469
U.S. 835 (1984).

14 In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d (BNA) 1443, 1444 (Fed. Cir. 1992).

Only after the Examiner has made a proper prima facie showing of lack of utility does the burden of
rebuttal shift to the applicant. The issue will then be decided on the totality of evidence.

The well established case law is clearly reflected in the Utility Examination Guidelines 
("Utility Guidelines") 15 , which acknowledge that an invention complies with the utility 
requirement of 35 U.S.C. §101, if it has at least one asserted "specific, substantial, and credible 
utility" or a "well-established utility." Under the Utility Guidelines, a utility is "specific" when it 
is particular to the subject matter claimed. For example, it is generally not enough to state that a 
nucleic acid is useful as a diagnostic without also identifying the conditions that are to be 
diagnosed. 

In explaining the "substantial utility" standard, M.P.E.P. §2107.01 cautions, however, 
that Office personnel must be careful not to interpret the phrase "immediate benefit to the public" 
or similar formulations used in certain court decisions to mean that products or services based on 
the claimed invention must be "currently available" to the public in order to satisfy the utility 
requirement. "Rather, any reasonable use that an applicant has identified for the invention that 
can be viewed as providing a public benefit should be accepted as sufficient, at least with regard 
to defining a 'substantial' utility." 16 Indeed, the Guidelines for Examination of Applications for 
Compliance With the Utility Requirement, 17 gives the following instruction to patent examiners: 
"If the applicant has asserted that the claimed invention is useful for any particular practical 
purpose . . . and the assertion would be considered credible by a person of ordinary skill in the 
art, do not impose a rejection based on lack of utility." 

B. Proper Application of the Legal Standard 

Appellants respectfully submit that they rely on the gene amplification data for
patentable utility of the claimed PRO1293 polypeptides, and that the gene amplification data for



15 66 Fed. Reg. 1092 (2001).

16 M.P.E.P. §2107.01. 

17 M.P.E.P. §2107 II (B)(1).

the gene encoding the PRO1293 polypeptide is clearly disclosed in the instant specification under
Example 143.

It was well known in the art at the time the invention was made that gene amplification is
an essential mechanism for oncogene activation. The gene amplification assay is well-described
in Example 143 of the present application. Example 143 discloses that the inventors isolated
genomic DNA from a variety of primary cancers and cancer cell lines that are listed in Table 8,
including primary lung and colon tumors of the type and stage indicated in Table 7. As a
negative control, DNA was isolated from the cells of ten normal healthy individuals and
pooled. Gene amplification was monitored using real-time quantitative
TaqMan™ PCR. Table 8 shows the resulting gene amplification data. Further, Example 143
explains that the results of TaqMan™ PCR are reported in ΔCt units, wherein one unit
corresponds to one PCR cycle or approximately a 2-fold amplification relative to control, two
units correspond to 4-fold amplification, 3 units to 8-fold amplification, etc.

Appellants respectfully submit that a ΔCt value of at least 1.0 was observed for PRO1293
in at least three of the tumors listed in Table 8. PRO1293 showed approximately 1.71 ΔCt units,
which corresponds to 2^1.71-fold amplification, or 3.272-fold amplification, in primary lung tumor
(HF-000840), and approximately 1.13-2.33 ΔCt units, which corresponds to 2^1.13- to 2^2.33-fold
amplification, or 2.189-fold to 5.028-fold amplification, in colon tumors (HF-000539 and
HF-000795). (See Table 8 and page 507, lines 5-12 of the specification). Accordingly, the present
specification clearly discloses overwhelming evidence that the gene encoding the PRO1293
polypeptide is significantly amplified in lung and colon tumors.
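
For illustration only, the arithmetic behind the figures recited above can be reproduced with a short Python sketch. It assumes nothing beyond the relationship stated in Example 143, namely that one ΔCt unit corresponds to approximately a 2-fold amplification, so that fold amplification is approximately 2 raised to the power ΔCt. The function name fold_from_delta_ct and the dictionary labels are hypothetical and do not appear in the specification.

```python
def fold_from_delta_ct(delta_ct: float) -> float:
    """Approximate fold amplification relative to control for a given ΔCt.

    Assumes each ΔCt unit (one PCR cycle) is roughly a 2-fold change,
    so 1 unit -> 2-fold, 2 units -> 4-fold, 3 units -> 8-fold.
    """
    return 2.0 ** delta_ct


# ΔCt values recited in the brief for PRO1293 (labels are illustrative only).
reported_delta_ct = {
    "primary lung tumor HF-000840": 1.71,
    "colon tumors, lower value": 1.13,
    "colon tumors, upper value": 2.33,
}

for label, delta_ct in reported_delta_ct.items():
    fold = fold_from_delta_ct(delta_ct)
    print(f"{label}: dCt {delta_ct:.2f} -> {fold:.3f}-fold")
```

Rounded to three decimal places, the computed values agree with the 3.272-fold, 2.189-fold and 5.028-fold figures stated above, and each exceeds the 2-fold level corresponding to a ΔCt of 1.0.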

It is also well known that gene amplification occurs in most solid tumors, and generally is 
associated with poor prognosis. 

In support, Appellants have submitted, in their Response filed September 9, 2004, a
Declaration by Dr. Audrey Goddard. Appellants particularly draw the Board's attention to page 3
of the Goddard Declaration which clearly states that:

It is further my considered scientific opinion that an at least 2-fold increase in
gene copy number in a tumor tissue sample relative to a normal (i.e., non-tumor)
sample is significant and useful in that the detected increase in gene copy
number in the tumor sample relative to the normal sample serves as a basis for
using relative gene copy number as quantitated by the TaqMan PCR technique
as a diagnostic marker for the presence or absence of tumor in a tissue sample of
unknown pathology. Accordingly, a gene identified as being amplified at least
2-fold by the quantitative TaqMan PCR assay in a tumor sample relative to a
normal sample is useful as a marker for the diagnosis of cancer, for
monitoring cancer development and/or for measuring the efficacy of cancer
therapy. (Emphasis added).

As indicated above, the gene encoding the PRO1293 polypeptide shows at least a
two-fold amplification in three different lung and colon tumors. In addition, the Goddard Declaration
clearly establishes that the TaqMan real-time PCR method described in Example 143 has gained
wide recognition for its versatility, sensitivity and accuracy, and is in extensive use for the study
of gene amplification. The facts disclosed in the Declaration also confirm that based upon the
gene amplification results, one of ordinary skill would find it credible that PRO1293 is a
diagnostic marker of lung and colon cancer.

The Examiner has asserted that "[t]he asserted utilities of cancer diagnostics for the 
claimed antibody that binds to the polypeptide of SEQ ID NO:77, are credible and specific. 
However, they are not substantial. The data set forth in the specification are preliminary at best." 
(Pages 5-6 of the Office Action mailed November 29, 2004). 

As stated above, in explaining the "substantial utility" standard, M.P.E.P. §2107.01 
cautions that Office personnel must be careful not to interpret the phrase "immediate benefit to 
the public" or similar formulations used in certain court decisions to mean that products or 
services based on the claimed invention must be "currently available" to the public in order to 
satisfy the utility requirement. Indeed, the Guidelines for Examination of Applications for 
Compliance With the Utility Requirement 18 states, "If the applicant has asserted that the claimed 
invention is useful for any particular practical purpose . . . and the assertion would be considered 
credible by a person of ordinary skill in the art, do not impose a rejection based on lack of 
utility." 

Appellants' position is based on the overwhelming evidence from gene amplification data 
disclosed in the specification which clearly indicate that the gene encoding PRO 1293 is 
significantly amplified in certain lung and colon tumors. Based on the working hypothesis
among those skilled in the art that if a gene is amplified in cancer, the encoded protein is likely to
be expressed at an elevated level, one skilled in the art would simply accept that since the
PRO 1293 gene is amplified, the PRO 1293 polypeptide would be more likely than not
over-expressed. Thus data relating to PRO 1293 polypeptide expression may be used for the same
diagnostic and prognostic purposes as data relating to PRO 1293 gene expression. Therefore,
based on the disclosure in the specification, no further research would be necessary to determine
how to use the claimed PRO 1293 polypeptides, because the current invention is fully enabled by
the disclosure of the present application.

18 M.P.E.P. §2107 II (B)(1).

Accordingly, Appellants submit that based on the general knowledge in the art at the time 
the invention was made and the teachings in the specification, the specification provides clear 
guidance as to how to interpret and use the data relating to the PRO 1293 polypeptide expression 
and that the PRO 1293 polypeptides have utility in the diagnosis of cancer. 

C. A prima facie case of lack of utility has not been established 

The Examiner has asserted that "the instant specification does not demonstrate that
the increased copy number of PRO 1293 DNA in lung and colon tumors, leads to an increased
expression of PRO 1293 polypeptide in these tumors." (Page 4 of the Office Action mailed
November 29, 2004). The Examiner concludes that "since Appellants do not provide 
information regarding the level of expression, an activity, or a role in cancer or any other disease 
for the claimed PRO 1293 polypeptide, the polypeptide lacks a substantial activity or well 
established utility." (Page 5 of the Office Action mailed November 29, 2004). 

The Examiner has cited Pennica et al. in support of the assertion that "protein levels 
cannot be accurately predicted from the level of the corresponding gene." (Page 5 of the Office 
Action mailed May 13, 2004). The Examiner has further cited Hu et al. in support of the
assertion that "the literature reports that gene amplification does not necessarily result in 
increased expression at the mRNA and polypeptide levels." (Page 5 of the Office Action mailed 
November 29, 2004; emphasis added). 

As a preliminary matter, Appellants respectfully submit that it is not a legal requirement 
to establish that gene amplification "necessarily" results in increased expression at the mRNA 
and polypeptide levels, or that protein levels can be "accurately predicted." As discussed above, 
the evidentiary standard to be used throughout ex parte examination of a patent application is a
preponderance of the totality of the evidence under consideration. Accordingly, Appellants 
submit that in order to overcome the presumption of truth that an assertion of utility by the 
applicant enjoys, the Examiner must establish that it is more likely than not that one of ordinary 
skill in the art would doubt the truth of the statement of utility. Therefore, it is not legally 
required that there be a "necessary" correlation between the data presented and the claimed 
subject matter. The law requires only that one skilled in the art should accept that such a 
correlation is more likely than not to exist. Appellants respectfully submit that when the proper
evidentiary standard is applied, a correlation must be acknowledged. 

Appellants submit that Pennica et al does not show a lack of correlation between gene 
(DNA) amplification and mRNA levels. According to the quoted statement from Pennica et al,
"WISP-1 gene amplification in human colon tumors showed a correlation between DNA
amplification and over-expression, whereas overexpression of WISP-3 RNA was seen in the 
absence of DNA amplification. In contrast, WISP-2 DNA was amplified in colon tumors, but its 
mRNA expression was significantly reduced in the majority of tumors compared with expression 
in normal colonic mucosa from the same patient." From this, the Examiner correctly concludes 
that increased copy number does not necessarily result in increased polypeptide expression. The 
standard, however, is not absolute certainty. The fact that in the case of a specific class of closely 
related molecules there seemed to be no correlation with gene amplification and the level of 
mRNA/protein expression, does not establish that it is more likely than not, in general, that such 
correlation does not exist. The Examiner has not shown whether the lack of correlation observed
for the family of WISP polypeptides is typical, or is merely a discrepancy, an exception to the
rule of correlation. Indeed, the working hypothesis among those skilled in the art is that, if a
gene is amplified in cancer, the encoded protein is likely to be expressed at an elevated level. In
fact, as noted even in Pennica et al, "[a]n analysis of WISP-1 gene amplification and expression
in human colon tumors showed a correlation between DNA amplification and over-expression
. . . " (Pennica et al, page 14722, left column, first full paragraph, emphasis added).

Accordingly, Appellants respectfully submit that Pennica et al teaches nothing 
conclusive regarding the absence of correlation between amplification of a gene and
over-expression of the encoded WISP polypeptide. More importantly, the teaching of Pennica et al is
specific to WISP genes. Pennica et al has no teaching whatsoever about the correlation of gene
amplification and protein expression in general.

The Examiner further cites Hu et al to the effect that genes displaying a 5-fold change or
less in mRNA expression in tumors compared to normal showed no evidence of a correlation 
between altered gene expression and a known role in the disease. However, among genes with a 
10-fold or more change in expression level, there was a strong and significant correlation 
between expression level and a published role in the disease. 

Appellants submit that in order to overcome the presumption of truth that an assertion of 
utility by the applicant enjoys, the Examiner must establish that it is more likely than not that one 
of ordinary skill in the art would doubt the truth of the statement of utility. Accordingly, contrary 
to the Examiner's assertion, Appellants submit that Hu et al does not conclusively show that it is 
more likely than not that gene amplification does not result in increased expression at the mRNA 
and polypeptide levels. First, the title of Hu et al is "Analysis of Genomic and Proteomic Data 
Using Advanced Literature Mining." As the title clearly suggests, the conclusion suggested by 
Hu et al is merely based on a statistical analysis of the information disclosed in the published 
literature. As Hu et al states, "We have utilized a computational approach to literature mining to 
produce a comprehensive set of gene-disease relationships." In particular, Hu et al relied on the 
MedGene Database and the Medical Subject Heading (MeSH) files to analyze the gene-disease 
relationship. More specifically, Hu et al "compared the MedGene breast cancer gene list to a 
gene expression data set generated from a micro-array analysis comparing breast cancer and 
normal breast tissue samples." (See page 408, right column). 

Therefore, Appellants first submit that the reference by Hu et al only studies the 
statistical analysis of micro-array data and not gene amplification data. Therefore, their findings 
would not be directly applicable to gene amplification data. In addition, Appellants respectfully 
submit that the Hu et al reference does not show that a lack of correlation between microarray 
data and the biological significance of cancer genes is typical. 

According to Hu et al, "different statistical methods" were applied to "estimate the 
strength of gene-disease relationships and evaluated the results." (See page 406, left column, 
emphasis added). Using these different statistical methods, Hu et al "[a]ssessed the relative
strengths of gene-disease relationships based on the frequency of both co-citation and single
citation." (See page 411, left column). It is well known in the art that various statistical methods
allow different variables to be manipulated to affect the outcome. For example, the authors 
admit, "Initial attempts to search the literature using" the list of genes, gene names, gene 
symbols, and frequently used synonyms, generated by the authors "revealed several sources of 
false positives and false negatives." (See page 406, right column). The authors further admit that 
the false positives caused by "duplicative and unrelated meanings for the term" were "difficult to 
manage." Therefore, in order to minimize such false positives, Hu et al disclose that these terms
"had to be eliminated entirely, thereby reducing the false positive rate but unavoidably
under-representing some genes." Id. Hence, Appellants respectfully submit that in order to minimize
the false positives and negatives in their analysis, Hu et al manipulated various aspects of the 
input data. 

Appellants further submit that the statistical analysis by Hu et al is not a reliable standard 
because the frequency of citation reflects only the current research interest of a molecule rather 
than the true biological function of the molecule. Indeed, the authors acknowledge that
"[r]elationships established by frequency of co-citation do not necessarily represent a true
biological link." (See page 411, right column). It often happens in scientific study that important
molecules are overlooked by the scientific community for many years until the discovery of their true
function. Therefore, Appellants submit that Hu et al drew their conclusion based on a very 
unreliable standard and that their research does not provide any meaningful information 
regarding the correlation between microarray data and the biological significance of a molecule. 

Even assuming that Hu et al provide evidence to support a true relationship, the 
conclusion in Hu et al only applies to a specific type of breast tumor (estrogen receptor 
(ER)-positive breast tumor) and cannot be generalized as a principle governing microarray study
of breast cancer in general, let alone the various other types of cancer genes in general. In fact,
even Hu et al admit that, "[i]t is likely that this threshold will change depending on the disease
as well as the experiment. Interestingly, the observed correlation was only found among 
ER-positive (breast) tumors not ER-negative tumors." (See page 412, left column). Therefore, 
based on these findings, the authors add, "This may reflect a bias in the literature to study the
more prevalent type of tumor in the population. Furthermore, this emphasizes that caution must 
be taken when interpreting experiments that may contain subpopulations that behave very 
differently." Id. (Emphasis added). 

In summary, Appellants respectfully submit that the Examiner has not shown that a lack 
of correlation between microarray data and the biological significance of cancer genes, as 
observed for ER-positive breast tumor, is typical. Since the standard is not absolute certainty, a
prima facie showing of lack of utility has not been made in this instance. The Patent Office has
failed to meet its initial burden of proof that Appellants' claims of utility are not substantial or
credible. The arguments presented by the Examiner in combination with the Pennica et al. and
Hu et al. articles do not provide sufficient reasons to doubt the statements by Appellants that
PRO 1293 has utility. As discussed above, the law does not require that gene amplification 
"necessarily" results in increased expression at the mRNA and polypeptide levels, or that protein 
levels must be "accurately predicted." Therefore, Appellants submit that the Examiner's 
reasoning is based on a misrepresentation of the scientific data presented in the above cited 
references and application of an improper, heightened legal standard. In fact, contrary to what 
the Examiner contends, the art indicates that, if a gene is amplified in cancer, it is more likely 
than not that the encoded protein will be expressed at an elevated level. 

D. It is "more likely than not" for amplified genes to have increased mRNA and 
protein levels 

Appellants have submitted ample evidence to show that, in general, if a gene is amplified 
in cancer, it is more likely than not that the encoded protein will be expressed at an elevated 
level. First, the articles by Orntoft et al., Hyman et al., and Pollack et al. (made of record in
Appellants' Response filed October 25, 2004) collectively teach that in general, gene
amplification increases mRNA expression. Second, the Declaration of Dr. Paul Polakis,
principal investigator of the Tumor Antigen Project of Genentech, Inc., the assignee of the
present application, shows that, in general, there is a correlation between mRNA levels and
polypeptide levels. Thus, taken together, all of the submitted evidence supports Appellants'
position that gene amplification is more likely than not predictive of increased mRNA and
polypeptide levels. 

Appellants submit that there are numerous articles which show that generally, if a gene is 
amplified in cancer, it is more likely than not that the mRNA transcript will be expressed at an 
elevated level. For example, Orntoft et al. (Mol. and Cell. Proteomics, 2002, vol. 1, pages 37-45
- made of record in Appellants' Response filed September 9, 2004) studied transcript levels of
5600 genes in malignant bladder cancers, many of which were linked to the gain or loss of 
chromosomal material using an array-based method. Orntoft et al. showed that there was a gene 
dosage effect and taught that "in general (18 of 23 cases) chromosomal areas with more than 2- 
fold gain of DNA showed a corresponding increase in mRNA transcripts" (see column 1, 
abstract). In addition, Hyman et al. (Cancer Res., 2002, vol. 62, pages 6240-45 - made of record
in Appellants' Response filed September 9, 2004) showed, using CGH analysis and cDNA
microarrays which compared DNA copy numbers and mRNA expression of over 12,000 genes in 
breast cancer tumors and cell lines, that there was "evidence of a prominent global influence of 
copy number changes on gene expression levels." (See page 6244, column 1, last paragraph). 
Additional supportive teachings were also provided by Pollack et al. (PNAS, 2002, vol. 99,
pages 12963-12968 - made of record in Appellants' Response filed September 9, 2004) who
studied a series of primary human breast tumors and showed that "...62% of highly amplified
genes show moderately or highly elevated expression, and DNA copy number influences gene 
expression across a wide range of DNA copy number alterations (deletion, low-, mid- and high- 
level amplification), and that on average, a 2-fold change in DNA copy number is associated 
with a corresponding 1.5-fold change in mRNA levels." Thus, these articles collectively teach
that in general gene amplification increases mRNA expression.

In addition, in their Response filed September 9, 2004, Appellants submitted a 
Declaration by Dr. Polakis, principal investigator of the Tumor Antigen Project of Genentech, 
Inc., the assignee of the present application, to show that mRNA expression correlates well with 
protein levels, in general. As Dr. Polakis explains, the primary focus of the microarray project 
was to identify tumor cell markers useful as targets for both the diagnosis and treatment of cancer 
in humans. The scientists working on the project extensively rely on results of microarray 
experiments in their effort to identify such markers. As Dr. Polakis explains, using microarray
analysis, Genentech scientists have identified approximately 200 gene transcripts (mRNAs) that 
are present in human tumor cells at significantly higher levels than in corresponding normal 
human cells. To the date of the Declaration, they have generated antibodies that bind to about 30 
of the tumor antigen proteins expressed from these differentially expressed gene transcripts and 
have used these antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. Having compared 
the levels of mRNA and protein in both the tumor and normal cells analyzed, they found a very 
good correlation between mRNA and corresponding protein levels. Specifically, in 
approximately 80% of their observations they have found that increases in the level of a
particular mRNA correlate with changes in the level of protein expressed from that mRNA.
While the proper legal standard is to show that the existence of correlation between mRNA and 
polypeptide levels is more likely than not, the showing of approximately 80% correlation for the 
molecules tested according to the Polakis Declaration greatly exceeds this legal standard. Based 
on these experimental data and his vast scientific experience of more than 20 years, Dr. Polakis 
states that, for human genes, increased mRNA levels typically correlate with an increase in 
abundance of the encoded protein. He further confirms that "it remains a central dogma in 
molecular biology that increased mRNA levels are predictive of corresponding increased levels 
of the encoded protein." 

Appellants further note that the sale of gene expression chips to measure mRNA levels is 
a highly successful business, with a company such as Affymetrix recording 168.3 million dollars 
in sales of their GeneChip arrays in 2004. Clearly, the research community believes that the 
information obtained from these chips is useful (i.e., that it is more likely than not informative of 
the protein level). 

Taken together, although there are some examples in the scientific art that do not fit 
within the central dogma of molecular biology that there is a correlation between polypeptide and 
mRNA levels, these instances are exceptions rather than the rule. In the majority of amplified 
genes, the teachings in the art, as exemplified by Orntoft et al., Hyman et al., Pollack et al., and
the Polakis Declaration, overwhelmingly show that gene amplification influences gene
expression at the mRNA and protein levels. Thus, one of skill in the art would reasonably expect
in this instance, based on the amplification data for the PRO 1293 gene, that the PRO 1293 
polypeptide is concomitantly overexpressed. Accordingly, Appellants submit that the PRO 1293 
polypeptides and nucleic acids have utility in the diagnosis of cancer and based on such a utility, 
one of skill in the art would know exactly how to use the claimed polypeptides for diagnosis of 
cancer. 

In the Office Action mailed November 29, 2004, the Examiner asserted that "Orntoft et 
al do not appear to look at gene amplification, mRNA levels and polypeptide levels from a 
single gene at a time. . . . Orntoft et al concentrated on regions of chromosomes with strong gains
of chromosomal material containing clusters of genes (p.40). This analysis was not done for 
PRO 1293 in the instant specification. That is, it is not clear whether or not PRO 1293 is in a gene 
cluster in a region of a chromosome that is highly amplified. Therefore, the relevance, if any of 
Orntoft et al is not clear." (Page 8 of the Office Action mailed November 29, 2004). The 
Examiner further alleges, "Hyman et al used the same CGH approach in their research. Less 
than half (44%) of highly amplified genes showed mRNA overexpression (abstract) .... 
Therefore, Hyman et al also do not support utility of the polypeptides of the instant invention." 
(Page 8 of the Office Action mailed November 29, 2004). The Examiner further alleges that 
"Pollack et al also used CGH technology, concentrating on large chromosome regions showing 
high amplification (p. 12965). Pollack et al did not investigate polypeptide levels." (Pages 8-9 
of the Office Action mailed November 29, 2004). 

Appellants respectfully point out that in Orntoft et al, 1,800 genes that yielded an 
increase or decrease in mRNA expression in two invasive tumors compared to the two non- 
invasive papillomas were then mapped to chromosomal locations. The chromosomes had 
already been analyzed for amplification by hybridizing tumor DNA to normal metaphase 
chromosomes (CGH). Orntoft et al used CGH alterations as the independent variable and 
estimated the frequency of expression alterations of the 1,800 genes in the chromosomal areas.
Orntoft et al found that in general (77% and 80% concordance) areas with a strong gain of 
chromosomal material contained a cluster of genes having increased mRNA expression (see page 
40). Orntoft et al state, "For both tumors TCC733 (p<0.015) and TCC827 (p<0.00003) a highly 
significant correlation was observed between the level of CGH ratio change (reflecting the DNA
copy number) and alterations detected by the array based technology" (see page 41, column 1). 
Orntoft et al. also studied the relation between altered mRNA and protein levels using
2D-PAGE analysis. Orntoft et al state, "In general there was a highly significant correlation
(p<0.005) between mRNA and protein alterations. . . . 26 well focused proteins whose genes had
a known chromosomal location were detected in TCCs 733 and 335, and of these 19 correlated
(p<0.005) with the mRNA changes detected using the arrays." (See page 42, column 2 to page
34, column 2). Accordingly, Orntoft et al clearly support Appellants' position that proteins
expressed by genes that are amplified in tumors are useful as cancer markers. 

The Examiner has stated that Appellants have not indicated whether PRO 1293 is in a 
gene cluster region of a chromosome. (Page 8 of the Office Action mailed November 29, 2004). 
Appellants fail to see how this is relevant to the analysis. Orntoft et al did not limit their 
findings to only those regions of amplified gene clusters. Further, as discussed below, Hyman et 
al and Pollack et al did gene-by-gene analysis across all chromosomes. 

Appellants respectfully submit that the Examiner has mischaracterized the methods used 
by Hyman et al and Pollack et al in their analysis. These papers did not use traditional CGH 
analysis to identify amplified genes. In Hyman et al, 13,824 cDNA clones were placed on glass 
slides in a microarray and genomic DNA from breast cancer cell lines and normal human WBCs 
was hybridized to the cDNA sequences. For expression analysis, RNA from tumor cell lines was 
hybridized on the same microarrays. The 13,824 arrayed cDNA clones were analyzed for gene 
expression and gene copy number in 14 breast cancer cell lines. Hyman et al state, "The results 
illustrate a considerable influence of copy number on gene expression patterns." For example, 
Hyman et al teach that "[u]p to 44% of the highly amplified transcripts (CGH ratio, >2.5) were 
overexpressed (i.e., belonged to the global upper 7% of expression ratios) compared with only 
6% for genes with normal copy number." (See page 6242, column 1). Further, Hyman et al 
state that "[t]he cDNA/CGH microarray technique enables the direct correlation of copy number 
and expression data on a gene-by-gene basis throughout the genome." (See page 6242, column 
2). Therefore, the analysis performed by Hyman et al was on a gene-by-gene basis, and clearly
shows that "it is more likely than not" that a gene which is amplified in tumor cells will have
increased gene expression. 

In Pollack et al, DNA copy number alteration across 6,691 mapped human genes in 44 
predominantly advanced primary breast tumors and 10 breast cancer cell lines was profiled. 
Pollack et al further state, "Parallel microarray measurements of mRNA levels reveal the 
remarkable degree to which variation in gene copy number contributes to variation in gene 
expression in tumor cells." (See Abstract). "Genome-wide, of 117 high-level DNA
amplifications (fluorescence ratios >4, and representing 91 different genes), 62% (representing 
54 different genes; . . .) are found associated with at least moderately elevated mRNA levels 
(mean-centered fluorescence ratios >2), and 42% (representing 36 different genes) are found 
associated with comparably highly elevated mRNA levels (mean-centered fluorescence ratios
>4)." (See page 12966, column 1). Therefore, the analysis performed by Pollack et al was also
on a gene-by-gene basis, and clearly shows that "it is more likely than not" that a gene which is
amplified in tumor cells will have increased gene expression. 

The Examiner further asserts that "none of the three papers reported that the research was 
relevant to identifying probes that can be used as cancer diagnostics" (Page 9 of the Office 
Action mailed November 29, 2004). Appellants respectfully point out that Hyman et al 
conducted additional studies of one of the genes found to be amplified, HOXB7, and found "a 
clinical association between HOXB7 amplification and poor patient prognosis." (Page 6244,
col. 1 to col. 2). Thus the results of Hyman et al confirm that genes which are amplified in
tumors have prognostic utility. The Board's attention is also respectfully directed to the final
paragraph of Pollack et al, wherein the authors conclude that "a substantial portion of the
phenotypic uniqueness (and, by extension, the heterogeneity in clinical behavior) among patients'
tumors may be traceable to underlying variation in DNA copy number." (Page 12968, col. 2).
Accordingly, Pollack et al confirm that genes that are amplified in at least one type of tumor are 
useful as markers for that type of tumor, and for prognostic uses directed to that type of tumor. 

With regard to the correlation between mRNA expression and protein levels, the 
Examiner has asserted that the Polakis Declaration is insufficient to overcome the rejection of 
claims 28-36 and 38-40 since it is limited to a discussion of data regarding the correlation of 
mRNA levels and polypeptide levels and not gene amplification levels. The Examiner further
asserted that the declaration does not provide data such that the Examiner can independently 
draw conclusions. (Page 10 of the Office Action mailed November 29, 2004). 

Appellants submit that Dr. Polakis' Declaration was presented to support the position that
there is a correlation between mRNA levels and polypeptide levels, the correlation between gene 
amplification and mRNA levels having already been established by the data shown in the Orntoft 
et al, Hyman et al, and Pollack et al articles. Appellants emphasize that the opinions expressed 
in the Polakis Declaration, including the quoted statement, are all based on factual findings. 
Thus, Dr. Polakis explains that in the course of their research using microarray analysis, he and 
his co-workers identified approximately 200 gene transcripts that are present in human tumor 
cells at significantly higher levels than in corresponding normal human cells. Subsequently, 
antibodies binding to about 30 of these tumor antigens were prepared, and mRNA and protein 
levels were compared. In approximately 80% of the cases, the researchers found that increases in 
the level of a particular mRNA correlated with changes in the level of protein expressed from 
that mRNA when human tumor cells are compared with their corresponding normal cells. Dr. 
Polakis' statement that "an increased level of mRNA in a tumor cell relative to a normal cell 
typically correlates to a similar increase in abundance of the encoded protein in the tumor cell 
relative to the normal cell" is based on factual, experimental findings, clearly set forth in the 
Declaration. Accordingly, the Declaration is not merely conclusory, and the fact-based
conclusions of Dr. Polakis would be considered reasonable and accurate by one skilled in the art. 

The case law has clearly established that in considering affidavit evidence, the Examiner
must consider all of the evidence of record anew. 19 "After evidence or argument is submitted by
the applicant in response, patentability is determined on the totality of the record, by a
preponderance of the evidence with due consideration to persuasiveness of argument." 20
Furthermore, the Federal Court of Appeals held in In re Alton, "We are aware of no reason why
opinion evidence relating to a fact issue should not be considered by an examiner." 21 Appellants
also respectfully draw the Examiner's attention to the Utility Examination Guidelines 22 which
state, "Office personnel must accept an opinion from a qualified expert that is based upon
relevant facts whose accuracy is not being questioned; it is improper to disregard the opinion
solely because of a disagreement over the significance or meaning of the facts offered." The
statement in question from an expert in the field (the Polakis Declaration) states that "it is my
considered scientific opinion that for human genes, an increased level of mRNA in a tumor cell
relative to a normal cell typically correlates to a similar increase in abundance of the encoded
protein in the tumor cell relative to the normal cell." Therefore, barring evidence to the contrary
regarding the above statement in the Polakis Declaration, this rejection is improper under both
the case law and the Utility guidelines.

19 In re Rinehart, 531 F.2d 1084, 189 U.S.P.Q. 143 (C.C.P.A. 1976) and In re Piasecki, 745 F.2d 1015,
226 U.S.P.Q. 881 (Fed. Cir. 1985).

20 In re Alton, 37 U.S.P.Q.2d 1578, 1584 (Fed. Cir. 1996) (quoting In re Oetiker, 977 F.2d 1443, 1445, 24
U.S.P.Q.2d 1443, 1444 (Fed. Cir. 1992)).

Taken together, although there are some examples in the scientific art that do not fit 
within the central dogma of molecular biology that there is a correlation between polypeptide and 
mRNA levels, these instances are exceptions rather than the rule. In the majority of amplified 
genes, the teachings in the art, as exemplified by Orntoft et al., Hyman et al., Pollack et al., and
the Polakis Declaration, overwhelmingly show that gene amplification influences gene 
expression at the mRNA and protein levels. Therefore, one of skill in the art would reasonably 
expect in this instance, based on the amplification data for the PRO 1293 gene, that the PRO 1293 
polypeptide is concomitantly overexpressed. Thus, Appellants submit that the claimed PRO 1293 
polypeptides have utility in the diagnosis of cancer. 

E. Even if a prima facie case of lack of utility has been established, it should be 
withdrawn on consideration of the totality of evidence 

Even if one assumes arguendo that it is more likely than not that there is no correlation 
between gene amplification and increased mRNA/protein expression, which Appellants submit is 
not true, a polypeptide encoded by a gene that is amplified in cancer would still have a specific, 
substantial, and credible utility. In support, Appellants respectfully draw the Board's attention to 

21 In re Alton, supra. 

22 Part IIB, 66 Fed. Reg. 1098 (2001). 
page 2 of the Declaration of Dr. Avi Ashkenazi (submitted with the Response filed September 9,
2004) which explains that, 

even when amplification of a cancer marker gene does not result in significant 
over-expression of the corresponding gene product, this very absence of gene 
product over-expression still provides significant information for cancer diagnosis 
and treatment. Thus, if over-expression of the gene product does not parallel gene 
amplification in certain tumor types but does so in others, then parallel monitoring 
of gene amplification and gene product over-expression enables more accurate 
tumor classification and hence better determination of suitable therapy. In 
addition, absence of over-expression is crucial information for the practicing 
clinician. If a gene is amplified but the corresponding gene product is not over- 
expressed, the clinician accordingly will decide not to treat a patient with agents 
that target that gene product. 

Appellants thus submit that simultaneous testing of gene amplification and gene product 
over-expression enables more accurate tumor classification, even if the gene-product, the protein, 
is not over-expressed. This leads to better determination of a suitable therapy. Further, as 
explained in Dr. Ashkenazi's Declaration, absence of over-expression of the protein itself is 
crucial information for the practicing clinician. If a gene is amplified in a tumor, but the 
corresponding gene product is not over-expressed, the clinician will decide not to treat a patient 
with agents that target that gene product. This not only saves money, but also has the benefit that 
the patient can avoid exposure to the side effects associated with such agents. 

This utility is further supported by the teachings of the article by Hanna and Mornin
(Pathology Associates Medical Laboratories, August 1999; submitted with the Response filed
September 9, 2004). The article teaches that the HER-2/neu gene has been shown to be
amplified and/or over-expressed in 10%-30% of invasive breast cancers and in 40%-60% of 
intraductal breast carcinomas. Further, the article teaches that diagnosis of breast cancer includes 
testing both the amplification of the HER-2/neu gene (by FISH) as well as the over-expression of 
the HER-2/neu gene product (by IHC). Even when the protein is not over-expressed, the assay 
relying on both tests leads to a more accurate classification of the cancer and a more effective 
treatment of it. 

The Examiner has asserted that "Hanna et al. supports the rejection, in that Hanna et al. 
show that gene amplification does not reliably correlate with protein over-expression, and thus 
the level of polypeptide expression must be tested empirically." (Page 7 of the Office Action
mailed November 29, 2004). Appellants respectfully point out that the Examiner appears to have 
misread Hanna et al. Hanna et al. clearly state that gene amplification (as measured by FISH)
and polypeptide expression (as measured by immunohistochemistry, IHC) are well correlated
("in general, FISH and IHC results correlate well" (Hanna et al., p. 1, col. 2)). It is only a subset
of tumors that shows discordant results. Thus, Hanna et al. support Appellants' position that it is
more likely than not that gene amplification correlates with increased polypeptide expression. 

Appellants have clearly shown that the gene encoding the PRO1293 polypeptide is
amplified in at least three lung and colon tumors. Therefore, the PRO1293 gene, similar to the
HER-2/neu gene disclosed in Hanna et al., is a tumor-associated gene. Furthermore, as discussed
above, in the majority of amplified genes, the teachings in the art overwhelmingly show that gene
amplification influences gene expression at the mRNA and protein levels. Therefore, one of skill
in the art would reasonably expect in this instance, based on the amplification data for the
PRO1293 gene, that the PRO1293 polypeptide is concomitantly overexpressed.

However, even if gene amplification does not result in overexpression of the gene product 
(i.e., the protein), an analysis of the expression of the protein is useful in determining the course
of treatment, as supported by the Ashkenazi Declaration and the Hanna paper. The Examiner 
"agrees that evidence regarding lack of over-expression would be useful" but asserts that "there is 
no evidence as to whether the gene products (such as the polypeptide) are over-expressed or not 
in the instant invention" and that "[f]urther research is required to determine such." (Page 7 of 
the Office Action mailed November 29, 2004). The Examiner appears to view the testing 
described in the Ashkenazi Declaration and the Hanna paper as experiments involving further 
characterization of the PRO1293 polypeptide itself. In fact, such testing is for the purpose of
characterizing not the PRO1293 polypeptide, but the tumors in which the gene encoding
PRO1293 is amplified. The PRO1293 polypeptide is therefore useful in tumor categorization,
the results of which become an important tool in the hands of a physician enabling the selection 
of a treatment modality that holds the most promise for the successful treatment of a patient. 

For the reasons given above, Appellants respectfully submit that the present specification 
clearly describes, details and provides a patentable utility for the claimed invention. 

Accordingly, Appellants respectfully request reconsideration and reversal of the rejections of 
Claims 28-36 and 38-40 under 35 U.S.C. §101. 

ISSUE II: Claims 28-36 and 38-40 satisfy the enablement requirement of 35 U.S.C. §112, 
first paragraph. 

Claims 28-36 and 38-40 stand rejected under 35 U.S.C. §112, first paragraph, allegedly 
"since the claimed invention is not supported by either a specific and substantial asserted utility or a 
well established utility for the reasons set forth above, one skilled in the art clearly would not know 
how to use the claimed invention." (Page 4 of the Office Action mailed November 29, 2004). 

In this regard, Appellants refer to the arguments and information presented above in 
response to the outstanding rejection under 35 U.S.C. § 101, wherein those arguments are 
incorporated by reference herein. Appellants respectfully submit that as described above, the 
PRO1293 polypeptides have utility in the diagnosis of cancer and, based on such a utility, one of
skill in the art would know exactly how to use the claimed polypeptides for diagnosis of cancer, 
without undue experimentation. 

The Examiner has further asserted that even if Appellants established an activity for the 
polypeptide of SEQ ID NO:77, 

Due to the large quantity of experimentation necessary to determine all the 
polypeptides comprising an amino acid that is at least 80%, 85%, 90%, 95% or 99% 
identical to the polypeptide of SEQ ID NO:77, and to screen an activity for them, 
the lack of direction/guidance presented in the specification regarding which 
variants of the polypeptide of SEQ ID NO:77 would retain the desired activity. . . 
the unpredictability of the effects of mutation on the structure and function of the 
claimed polypeptide, and the breadth of the claims which fail to recite particular 
biological activities, undue experimentation would be required of the skilled artisan 
to make and/or use the claimed invention in its full scope. (Pages 11-12 of the
Office Action mailed November 29, 2004). 

Appellants note that the claimed variants all share the functional limitation that "the
nucleic acid encoding said polypeptide is amplified in lung or colon tumors." Thus, the claims
recite "particular biological activities" as required in the Office Action. Appellants note that
since the recited activity is that of the encoding nucleic acids, consideration of "the effects of
mutation on the structure and function of the claimed polypeptide" is not relevant. As discussed
above, under Issue I concerning the rejection under 35 U.S.C. § 101, polypeptides wherein the 
encoding nucleic acid is amplified in tumor tissue are themselves more likely than not to be over-expressed
in tumor tissues; thus they have utility as markers for those tumor types in which they
are over-expressed. Further, as discussed above, under Issue I concerning the rejection under 35 
U.S.C. § 101, simultaneous testing of gene amplification and gene product over-expression 
enables more accurate tumor classification, even if the gene-product, the protein, is not over- 
expressed. This leads to better determination of a suitable therapy for the tumor. Accordingly, 
one of ordinary skill in the art would understand how to use polypeptide variants wherein the 
nucleic acid encoding said polypeptide is amplified in lung or colon tumors in the diagnosis or 
classification of cancer. 

Example 143 of the present application provides step-by-step guidelines and protocols for 
the gene amplification assay. By following the disclosure in the specification, one skilled in the 
art can easily test whether a gene encoding a variant PRO1293 protein is amplified in lung or
colon tumors. The specification further describes methods for the determination of percent 
identity between two amino acid sequences. (See page 302, line 4, to page 305, line 4). In fact, 
the specification teaches specific parameters to be associated with the term "percent identity" as 
applied to the present invention. The specification further provides detailed guidance as to 
changes that may be made to a PRO polypeptide without adversely affecting its activity (page 
354, line 30 to page 357, line 7). This guidance includes a listing of exemplary and preferred 
substitutions for each of the twenty naturally occurring amino acids (Table 6, page 356). 
Accordingly, one of skill in the art could identify whether the variant PRO1293 sequence falls
within the parameters of the claimed invention. Once such an amino acid sequence is identified, 
the specification sets forth methods for making the amino acid sequences (see page 354, line 30 
to page 358, line 34) and methods of preparing the PRO polypeptides (see page 358, line 35 and 
onward). 

Therefore, Appellants respectfully submit that the specification provides ample guidance 
such that one of skill in the art could readily test a nucleic acid sequence which encodes a variant 
polypeptide to determine whether it is amplified by the methods set forth in Example 143. 
Furthermore, one of ordinary skill in the art has a sufficiently high level of technical competence 
to identify sequences with at least 80% identity to SEQ ID NO:77. Accordingly, one of ordinary
skill could practice the claimed invention without undue experimentation. 

The claims currently recite polypeptide sequences associated with a biological activity of 
the encoding polynucleotides. This biological activity, together with the well-defined, relatively
high degree of sequence identity and general knowledge in the art at the time the invention was
made, sufficiently defines the claimed genus such that one skilled in the art, at the effective date
of the present application, would have known how to make and use the claimed polypeptide 
sequences without undue experimentation. As the M.P.E.P. states, "[t]he fact that 
experimentation may be complex does not necessarily make it undue, if the art typically engages 
in such experimentation." 23 

As discussed above, a considerable amount of experimentation is permissible, if it is 
merely routine. Appellants submit that the identification of variant PRO1293 polypeptides
having at least 80% identity to SEQ ID NO:77 wherein the polynucleotide encoding the 
polypeptide is amplified in lung or colon tumors, can be performed by techniques that were well 
known in the art at the priority date of this application, and that the performance of such work 
does not require undue experimentation. 

Accordingly, Appellants respectfully request reconsideration and reversal of the 
enablement rejection of Claims 28-36 and 38-40 under 35 U.S.C. §112, first paragraph. 

ISSUE III: Claims 28-32 satisfy the written description requirement of 35 U.S.C. §112,
First Paragraph 

Claims 28-32 stand rejected under 35 U.S.C. §112, first paragraph as allegedly lacking 
adequate written description. In particular, the Examiner has asserted that "[a]lthough the
specification describes the structure of PRO1293 polypeptide, the skilled artisan would not be
able to visualize the structure of the polypeptides having at least 80%, 85%, 90%, 95% or
99% identity with the polypeptide of SEQ ID NO:77, because the claims are not described by
structure and functional identity." (Page 12 of the Office Action mailed November 29, 2004).

23 M.P.E.P. §2164.01 citing In re Certain Limited-charge Cell Culture Microcarriers, 221 U.S.P.Q. 1165,
1174 (Int'l Trade Comm'n 1983), aff'd sub nom. Massachusetts Institute of Technology v. A.B. Fortia, 774 F.2d 1104,
227 U.S.P.Q. 428 (Fed. Cir. 1985).
Currently pending Claims 28-32 recite the functional limitation that the nucleic acid
encoding the polypeptide is amplified in lung or colon tumors. Accordingly, coupled with the 
general knowledge available in the art at the time of the invention, Appellants submit that the 
specification provides ample written support for the claimed polypeptides in Example 143, where 
methods of detecting and quantifying amplification in several tumors and/or cell lines are 
described. Thus, based on the high percentage of sequence identity and the described method of 
detecting and quantifying amplification in tumors, one skilled in the art would have known at the 
time of the invention that the Appellants had possession of the claimed polypeptides. 

A. The Legal Test for Written Description 

The well-established test for sufficiency of support under the written description 
requirement of 35 U.S.C. §112, first paragraph is "whether the disclosure of the application as 
originally filed reasonably conveys to the artisan that the inventor had possession at that time of 
the later claimed subject matter, rather than the presence or absence of literal support in the 
specification for the claim language." 24,25 The adequacy of written description support is a 
factual issue and is to be determined on a case-by-case basis. The factual determination in a 
written description analysis depends on the nature of the invention and the amount of knowledge 
imparted to those skilled in the art by the disclosure. 27,28 

In Environmental Designs, Ltd. v. Union Oil Co., 29 the Federal Circuit held, "Factors
that may be considered in determining level of ordinary skill in the art include (1) the educational 
level of the inventor; (2) type of problems encountered in the art; (3) prior art solutions to those 
problems; (4) rapidity with which innovations are made; (5) sophistication of the technology; and 



24 In re Kaslow, 707 F.2d 1366, 1374, 212 U.S.P.Q. 1089, 1096 (Fed. Cir. 1983).

25 See also Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 U.S.P.Q.2d at 1116 (Fed. Cir. 1991).

26 See e.g., Vas-Cath, 935 F.2d at 1563; 19 U.S.P.Q.2d at 1116.

27 Union Oil v. Atlantic Richfield Co., 208 F.3d 989, 996 (Fed. Cir. 2000).

28 See also M.P.E.P. §2163 II(A).

29 713 F.2d 693, 696, 218 U.S.P.Q. 865, 868 (Fed. Cir. 1983), cert. denied, 464 U.S. 1043 (1984).
(6) educational level of active workers in the field." (Emphasis added). Further, the
"hypothetical 'person having ordinary skill in the art' to which the claimed subject matter pertains
would, of necessity have the capability of understanding the scientific and engineering principles
applicable to the pertinent art." 31, 32

B. The Disclosure Provides Sufficient Written Description for the Claimed 
Invention 

Appellants respectfully submit that the instant specification evidences the actual 
reduction to practice of the amino acid sequence of SEQ ID NO:77. The Examiner has
acknowledged that polypeptides comprising the sequence set forth in SEQ ID NO:77 meet the 
written description provision of 35 U.S.C. §112, first paragraph. (Page 12 of the Office Action 
mailed November 29, 2004). Thus, the genus of polypeptides with at least 80% sequence 
identity to SEQ ID NO:77, which possess the functional property of having a nucleic acid which 
is amplified in lung or colon tumors would meet the requirement of 35 U.S.C. §112, first 
paragraph, as providing adequate written description. 

Appellants have provided native PRO sequence SEQ ID NO:77. The present application 
also describes methods for identifying genes which are amplified in lung or colon tumors. 

Example 143 of the present application provides step-by-step guidelines and protocols for 
the gene amplification assay. By following the disclosure in the specification, one skilled in the 
art can easily test whether a gene encoding a variant PRO1293 protein is amplified in lung or
colon tumors. The specification further describes methods for the determination of percent
identity between two amino acid sequences. (See page 302, line 4 to page 305, line 4). In fact, 
the specification teaches specific parameters to be associated with the term "percent identity" as 
applied to the present invention. The specification further provides detailed guidance as to 
changes that may be made to a PRO polypeptide without adversely affecting its activity (page 
354, line 30 to page 357, line 7). This guidance includes a listing of exemplary and preferred 

30 See also M.P.E.P. §2141.03. 

31 Ex parte Hiyamizu, 10 U.S.P.Q.2d 1393, 1394 (Bd. Pat. App. & Inter. 1988) (emphasis added). 

32 See also M.P.E.P. §2141.03.
substitutions for each of the twenty naturally occurring amino acids (Table 6, page 356).
Accordingly, one of skill in the art could identify whether the variant PRO1293 sequence falls
within the parameters of the claimed invention. Once such an amino acid sequence is
identified, the specification sets forth methods for making the amino acid sequences (see page 
354, line 30 to page 358, line 34) and methods of preparing the PRO polypeptides (see page 185, 
line 36 and onward). 

Therefore, Appellants respectfully submit that one of skill in the art could readily test a 
nucleic acid sequence which encodes a variant polypeptide to determine whether it is amplified 
by the methods set forth in Example 143. 

The Examiner has asserted that "although the claims recite both percent identity and 
functional language, the recited function is for the nucleic acid encoding the polypeptide of SEQ 
ID NO: 77, and not for the polypeptide itself. The specification does not disclose a function for 
the polypeptide of SEQ ID NO:77, neither does the specification disclose a variant of the 
polypeptide of SEQ ID NO:77 that displays an activity." (Page 12 of the Office Action mailed 
November 29, 2004). In this regard, Appellants refer to the arguments and information presented 
above in response to the outstanding rejections under 35 U.S.C. §101 and 35 U.S.C. §112, first 
paragraph, for alleged lack of utility and enablement. These arguments are incorporated by 
reference herein. Appellants respectfully submit that as discussed above under Issue I, the 
teachings in the art, as exemplified by Orntoft et al., Hyman et al., Pollack et al., and the Polakis
Declaration, overwhelmingly show that gene amplification influences gene expression at the 
mRNA and protein levels. Thus the amplification of the encoding polynucleotide in tumors does 
provide useful information regarding the functional property of the polypeptide in being 
overexpressed in tumor tissues. 

Appellants further respectfully submit that whether or not the polypeptide is also 
overexpressed in tumor tissues is irrelevant to the consideration of adequate written description. 
The claims have characterized the recited polypeptides as having the property that their encoding 
polynucleotides are amplified in lung or colon tumors. As discussed above, the specification 
describes methods for identifying genes which are amplified in lung or colon tumors. Therefore, 
one of skill in the art could readily test a nucleic acid sequence which encodes a variant 
polypeptide to determine whether it is amplified by the methods set forth in Example 143. Thus,
the recited property of amplification of the encoding gene adds to the characterization of the 
claimed polypeptide sequences in a manner that one of skill in the art could readily assess and 
understand. 

As discussed above, Appellants have recited structural features, namely, at least 80% sequence
identity to SEQ ID NO:77, which are common to the genus. Appellants have also provided
guidance as to how to make the recited variants of SEQ ID NO:77, including listings of 
exemplary and preferred sequence substitutions. The genus of claimed polypeptides is further 
defined by having a specific functional activity for the encoding nucleic acids. Accordingly, a 
description of the claimed genus has been achieved. 

For the above reasons, the specification provides adequate written description for 
polypeptides having at least 80% identity to SEQ ID NO: 77 wherein the nucleic acid encoding 
the polypeptide is amplified in lung or colon tumors. Accordingly, Appellants respectfully 
request reconsideration and reversal of the written description rejection of Claims 28-32 under 35 
U.S.C. §112, first paragraph. 

ISSUE IV: Claims 28-36 and 38-40 are not anticipated under 35 U.S.C. §102(a) by Botstein
et al., WO200053751, or Baker et al., WO200012708.

Claims 28-36 and 38-40 stand rejected under 35 U.S.C. §102(a) as being anticipated by 
Botstein et al., WO200053751, published on September 14, 2000, and by Baker et al.,
WO200012708, published on March 9, 2000. 

Appellants submit that, as discussed above in response to the outstanding rejections under 
35 U.S.C. §101 and 35 U.S.C. §112, first paragraph, for alleged lack of utility and enablement 
(Issue I and Issue II), Appellants rely on the gene amplification results (Example 143) to establish 
a credible, substantial and specific asserted utility for the polypeptide PRO1293. These results
were first disclosed in U.S. Provisional Application Serial No. 60/162,506, filed on October 29, 
1999. As discussed above, the disclosure of the instant application, which is similar to that of the 
earlier-filed application (U.S. Provisional Application Serial No. 60/162,506), provides the 
support required under 35 U.S.C. §112 for the subject matter of the instant claims. Accordingly, 
Appellants submit that the subject matter of the instant claims is disclosed in the manner 
provided by 35 U.S.C. §112 in U.S. Provisional Application Serial No. 60/162,506. Therefore,
the effective filing date of this application is October 29, 1999, the filing date of U.S. Provisional 
Application Serial No. 60/162,506. 

The PCT patent application by Botstein et al., WO200053751, was published on
September 14, 2000, which is over ten months after the effective filing date of the instant 
application; hence Botstein et al. is not prior art. 

The PCT patent application by Baker et al., WO200012708, was published on March 9,
2000, which is over four months after the effective filing date of the instant application; hence
Baker et al. is not prior art.

The Examiner has asserted that the subject matter of the claimed invention "is not 
supported by the disclosure in. . .60/162,506, filed October 29, 1999, since the prior application 
does not provide a specific and substantial utility or a well established utility for the claimed 
invention." The Examiner has further asserted that "the increased copy number of PRO1293
DNA in said tumors, does not provide a readily apparent use for the polypeptide of SEQ ID 
NO:77, because the assay does not show that the polypeptide is also amplified in these tumors." 
(Pages 2-3 of the Office Action mailed November 29, 2004). 

In this regard, Appellants refer to the arguments and information presented above in 
response to the outstanding rejections under 35 U.S.C. §101 and 35 U.S.C. §112, first paragraph, 
for alleged lack of utility and enablement. These arguments are incorporated by reference herein. 
Appellants respectfully submit that as described above under Issue I, the presently claimed 
invention is supported by a specific, substantial and credible utility and, therefore, the present 
specification teaches one of ordinary skill in the art "how to use" the claimed invention without 
undue experimentation, as described above. 

Accordingly, Appellants respectfully request reconsideration and reversal of the rejection 
of Claims 28-36 and 38-40 under 35 U.S.C. §102(a) as being anticipated by Botstein et al. or
Baker et al.
CONCLUSION

For the reasons given above, Appellants submit that the specification discloses at least 
one patentable utility for the PRO1293 polypeptides of Claims 28-36 and 38-40, and that one of
ordinary skill in the art would understand how to use the claimed polypeptides, for example in
the diagnosis of lung and colon tumors. Therefore, claims 28-36 and 38-40 meet the 
requirements of 35 U.S.C. §101 and 35 U.S.C. §112, first paragraph. Further, this patentable 
utility for the claimed polypeptides was first disclosed in U.S. Provisional Application Serial No. 
60/162,506, filed on October 29, 1999, priority to which is claimed in the instant application. 
Accordingly, the instant application has an effective priority date of October 29, 1999, and 
therefore Botstein et al., WO200053751, published on September 14, 2000, and Baker et al.,
WO200012708, published on March 9, 2000, are not prior art and do not anticipate the claims 
under 35 U.S.C. § 102(a). 

Appellants further submit that the recited polypeptide variants of claims 28-32 meet the 
written description requirement of 35 U.S.C. §112, first paragraph.

Accordingly, reversal of all the rejections of claims 28-36 and 38-40 is respectfully 
requested. 

Please charge any additional fees, including fees for additional extension of time, or 
credit overpayment to Deposit Account No. 08-1641 (referencing Attorney's Docket 
No. 39780-2830 P1C3).

Respectfully submitted, 

Date: November 22, 2005 By: 

Barrie D. Greene (Reg. No. 46,740) 

HELLER EHRMAN LLP 

275 Middlefield Road 
Menlo Park, California 94025-3506 
Telephone: (650) 324-7000
Facsimile: (650) 324-0638 
8. CLAIMS APPENDIX

Claims on Appeal 

28. An isolated polypeptide having at least 80% amino acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:77; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:77, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the extracellular domain of the polypeptide of SEQ ID 
NO:77; or

(d) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203292; 

wherein the nucleic acid encoding the polypeptide is amplified in lung or colon tumors. 

29. The isolated polypeptide of Claim 28 having at least 85% amino acid sequence 
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:77; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:77, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the extracellular domain of the polypeptide of SEQ ID 
NO:77; or 

(d) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203292; 

wherein the nucleic acid encoding the polypeptide is amplified in lung or colon tumors. 

30. The isolated polypeptide of Claim 28 having at least 90% amino acid sequence 
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:77;

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 77, lacking its 
associated signal peptide; 
(c) the amino acid sequence of the extracellular domain of the polypeptide of SEQ ID
NO:77; or 

(d) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203292; 

wherein the nucleic acid encoding the polypeptide is amplified in lung or colon tumors. 

31. The isolated polypeptide of Claim 28 having at least 95% amino acid sequence
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:77; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 77, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the extracellular domain of the polypeptide of SEQ ID
NO:77; or

(d) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203292; 

wherein the nucleic acid encoding the polypeptide is amplified in lung or colon tumors. 

32. The isolated polypeptide of Claim 28 having at least 99% amino acid sequence 
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO: 77; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:77, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the extracellular domain of the polypeptide of SEQ ID
NO:77; or

(d) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203292; 

wherein the nucleic acid encoding the polypeptide is amplified in lung or colon tumors. 

33. An isolated polypeptide comprising: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO: 77; 
(b) the amino acid sequence of the polypeptide of SEQ ID NO:77, lacking its
associated signal peptide; 

(c) the amino acid sequence of the extracellular domain of the polypeptide of SEQ ID
NO:77; or

(d) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203292. 

34. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:77. 

35. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:77, lacking its associated signal peptide. 

36. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
extracellular domain of the polypeptide of SEQ ID NO:77. 

38. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC 
accession number 203292. 

39. A chimeric polypeptide comprising a polypeptide according to Claim 28 fused to 
a heterologous polypeptide. 

40. The chimeric polypeptide of Claim 39, wherein said heterologous polypeptide is 
an epitope tag or an Fc region of an immunoglobulin. 

9. EVIDENCE APPENDIX

1. Declaration of Audrey D. Goddard, Ph.D. under 37 C.F.R. §1.132, with attached Exhibits
A-G: 

A. Curriculum Vitae of Audrey D. Goddard, Ph.D. 

B. Higuchi, R. et al., "Simultaneous amplification and detection of specific DNA 
sequences," Biotechnology 10:413-417 (1992). 

C. Livak, K.J., et al., "Oligonucleotides with fluorescent dyes at opposite ends provide a 
quenched probe system useful for detecting PCR product and nucleic acid hybridization," PCR 
Methods Appl 4:357-362 (1995). 

D. Heid, C.A. et al., "Real time quantitative PCR," Genome Res, 6:986-994 (1996). 

E. Pennica, D. et al., "WISP genes are members of the connective tissue growth factor 
family that are up-regulated in Wnt-1-transformed cells and aberrantly expressed in human colon
tumors," Proc. Natl. Acad. Sci. USA 95:14717-14722 (1998).

F. Pitti, R.M. et al., "Genomic amplification of a decoy receptor for Fas ligand in lung and 
colon cancer," Nature 396:699-703 (1998). 

G. Bieche, I. et al., "Novel approach to quantitative polymerase chain reaction using real- 
time detection: Application to the detection of gene amplification in breast cancer," Int. J. Cancer 
78:661-666 (1998).

2. Declaration of Paul Polakis, Ph.D. under 37 C.F.R. §1.132. 

3. Declaration of Avi Ashkenazi, Ph.D. under 37 C.F.R. §1.132; with attached Exhibit A 
(Curriculum Vitae). 

4. Orntoft, T.F., et al., "Genome-wide Study of Gene Copy Numbers, Transcripts, and 
Protein Levels in Pairs of Non-Invasive and Invasive Human Transitional Cell Carcinomas," 
Molecular & Cellular Proteomics 1:37-45 (2002). 

5. Hyman, E., et al., "Impact of DNA Amplification on Gene Expression Patterns in Breast 
Cancer," Cancer Research 62:6240-6245 (2002). 

6. Pollack, J.R., et al., "Microarray Analysis Reveals a Major Direct Role of DNA Copy
Number Alteration in the Transcriptional Program of Human Breast Tumors," Proc. Natl. Acad.
Sci. USA 99:12963-12968 (2002). 

7. Hanna, J.S., et al., "HER-2/neu Breast Cancer Predictive Testing," Pathology Associates 
Medical Laboratories (1999). 

8. Pennica, D. et al., "WISP genes are members of the connective tissue growth factor family 
that are up-regulated in Wnt-1-transformed cells and aberrantly expressed in human colon tumors,"
Proc. Natl. Acad. Sci. USA 95:14717-14722 (1998). 

9. Hu, Y. et al., "Analysis of genomic and proteomic data using advanced literature mining,"
Journal of Proteome Research 2:405-412 (2003).

Items 1-3 were submitted with Appellants' Response filed September 9, 2004, and made of record
by the Examiner in the Office Action mailed November 29, 2004. 

Items 4-7 were made of record by Appellants in their IDS filed September 9, 2004, and marked as 
considered by the Examiner on November 17, 2004.

Item 8 was made of record by the Examiner in the Office Action mailed May 13, 2004. 
Item 9 was made of record by the Examiner in the Office Action mailed November 29, 2004. 

10. RELATED PROCEEDINGS APPENDIX

None. 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re Application of: Ashkenazi et al.
Serial No.: 09/903,925
Filed: July 11, 2001

For: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC ACIDS

Group Art Unit: 1647

Examiner: Fozia Hamid

CERTIFICATE OF MAILING

I hereby certify that this correspondence is being deposited with the United
States Postal Service with sufficient postage as first class mail in an envelope




DECLARATION OF AUDREY D. GODDARD, Ph.D. UNDER 37 C.F.R. §1.132

Assistant Commissioner of Patents 
Washington, D.C. 20231



Sir: 

I, Audrey D. Goddard, Ph.D. do hereby declare and say as follows: 

1. I am a Senior Clinical Scientist at the Experimental Medicine/BioOncology, Medical
Affairs Department of Genentech, Inc., South San Francisco, California 94080. 

2. Between 1993 and 2001, I headed the DNA Sequencing Laboratory at the Molecular
Biology Department of Genentech, Inc. During this time, my responsibilities included the 
identification and characterization of genes contributing to the oncogenic process, and determination 
of the chromosomal localization of novel genes. 

3. My scientific Curriculum Vitae, including my list of publications, is attached to and
forms part of this Declaration (Exhibit A). 

4. I am familiar with a variety of techniques known in the art for detecting and 
quantifying the amplification of oncogenes in cancer, including the quantitative TaqMan PCR (i.e., 
"gene amplification") assay described in the above captioned patent application. 

5. The TaqMan PCR assay is described, for example, in the following scientific 
publications: Higuchi et al., Biotechnology 10:413-417 (1992) (Exhibit B); Livak et al., PCR
Methods Appl. 4:357-362 (1995) (Exhibit C) and Heid et al., Genome Res. 6:986-994 (1996)
(Exhibit D). Briefly, the assay is based on the principle that successful PCR yields a fluorescent 
signal due to Taq DNA polymerase-mediated exonuclease digestion of a fluorescently labeled 
oligonucleotide that is homologous to a sequence between two PCR primers. The extent of 
digestion depends directly on the amount of PCR, and can be quantified accurately by measuring the 
increment in fluorescence that results from decreased energy transfer. This is an extremely sensitive 
technique, which allows detection in the exponential phase of the PCR reaction and, as a result, 
leads to accurate determination of gene copy number. 
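
The link between exponential-phase detection and copy number described in paragraph 5 can be sketched in Python under an idealized model. The model is an assumption for illustration and is not taken from this Declaration: product is assumed to multiply by a constant factor (`efficiency`, 2.0 for perfect doubling) in every cycle, and the detection threshold is assumed to correspond to a fixed number of product copies (`threshold_copies`, an arbitrary value). Under those assumptions, a sample that starts with twice as many copies reaches the threshold one cycle earlier.

```python
import math


def threshold_cycle(start_copies, threshold_copies=1e10, efficiency=2.0):
    """Cycle at which start_copies * efficiency**cycle reaches threshold_copies.

    Idealized: assumes a constant per-cycle efficiency (exponential phase only).
    """
    if start_copies <= 0 or threshold_copies <= 0 or efficiency <= 1.0:
        raise ValueError("copies must be positive and efficiency must exceed 1")
    return math.log(threshold_copies / start_copies) / math.log(efficiency)


def copy_ratio(ct_a, ct_b, efficiency=2.0):
    """Starting-copy ratio (sample A / sample B) implied by two threshold cycles."""
    return efficiency ** (ct_b - ct_a)


# Hypothetical starting copy numbers, chosen only for illustration.
ct_normal = threshold_cycle(1_000)
ct_tumor = threshold_cycle(2_000)

print(round(ct_normal - ct_tumor, 6))             # cycles gained by the 2x sample
print(round(copy_ratio(ct_tumor, ct_normal), 6))  # ratio recovered from the cycles
```

Real reactions fall short of perfect doubling, so `efficiency` would normally be estimated from a standard curve rather than assumed.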

6. The quantitative fluorescent TaqMan PCR assay has been extensively and 
successfully used to characterize genes involved in cancer development and progression. 
Amplification of protooncogenes has been studied in a variety of human tumors, and is widely 
considered as having etiological, diagnostic and prognostic significance. This use of the quantitative 
TaqMan PCR assay is exemplified by the following scientific publications: Pennica et al., Proc.
Natl. Acad. Sci. USA 95(25):14717-14722 (1998) (Exhibit E); Pitti et al., Nature
396(6712):699-703 (1998) (Exhibit F) and Bieche et al., Int. J. Cancer 78:661-666 (1998) (Exhibit
G), the first two of which I co-authored. In particular, Pennica et al. have used the quantitative
TaqMan PCR assay to study relative gene amplification of WISP and c-myc in various cell lines,
colorectal tumors and normal mucosa. Pitti et al. studied the genomic amplification of a decoy
receptor for Fas ligand in lung and colon cancer, using the quantitative TaqMan PCR assay. Bieche
et al. used the assay to study gene amplification in breast cancer.

7. It is my personal experience that the quantitative TaqMan PCR technique is 
technically sensitive enough to detect at least a 2-fold increase in gene copy number relative to
control. It is further my considered scientific opinion that an at least 2-fold increase in gene copy
number in a tumor tissue sample relative to a normal (i.e., non-tumor) sample is significant and 
useful in that the detected increase in gene copy number in the tumor sample relative to the normal 
sample serves as a basis for using relative gene copy number as quantitated by the TaqMan PCR 
technique as a diagnostic marker for the presence or absence of tumor in a tissue sample of unknown 
pathology. Accordingly, a gene identified as being amplified at least 2-fold by the quantitative 
TaqMan PCR assay in a tumor sample relative to a normal sample is useful as a marker for the 
diagnosis of cancer, for monitoring cancer development and/or for measuring the efficacy of cancer 
therapy. 
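
The 2-fold criterion in paragraph 7 can be illustrated with a short Python sketch. It uses the commonly used comparative Ct calculation (2 to the power of minus delta-delta-Ct), which is an assumption here: the Declaration does not state which calculation was applied. The function names, the use of a reference gene for normalization, and the Ct values in the usage lines are all hypothetical.

```python
def fold_change(ct_target_tumor, ct_ref_tumor,
                ct_target_normal, ct_ref_normal, efficiency=2.0):
    """Relative copy number of a target gene, tumor versus normal.

    Each sample's target Ct is first normalized to a reference gene measured
    in the same sample; the difference between the two normalized values is
    then converted to a fold change. Assumes equal efficiency for both genes.
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor
    delta_normal = ct_target_normal - ct_ref_normal
    return efficiency ** -(delta_tumor - delta_normal)


def is_amplified(fold, min_fold=2.0):
    """Apply the 'at least 2-fold' criterion to a computed fold change."""
    return fold >= min_fold


# Hypothetical Ct values: the target crosses threshold one cycle earlier in
# tumor than in normal tissue, while the reference gene is unchanged.
fold = fold_change(24.0, 20.0, 25.0, 20.0)
print(fold, is_amplified(fold))
```

A one-cycle shift corresponds to a 2-fold difference only when amplification efficiency is close to perfect doubling, which is why the sketch exposes `efficiency` as a parameter.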

8. I declare further that all statements made herein of my own knowledge are true and 
that all statements made on information and belief are believed to be true. I declare that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code, and that such willful false statements may jeopardize the validity of the application or any 
patent issuing thereon. 

Date 





Audrey D. Goddard, Ph.D. 

AUDREY D. GODDARD, Ph.D.



110 Congo St. 
San Francisco, CA, 94131 
415.841.9154 
415.819.2247 (mobile) 
agoddard@pacbell.net 

PROFESSIONAL EXPERIENCE 

Genentech, Inc. 1993-present
South San Francisco, CA 

2001 - present Senior Clinical Scientist 

Experimental Medicine / BioOncology, Medical Affairs 

Responsibilities: 

• Companion diagnostic oncology products 

• Acquisition of clinical samples from Genentech's clinical trials for translational research 

• Translational research using clinical specimen and data for drug development and 
diagnostics 

• Member of Development Science Review Committee, Diagnostic Oversight Team, 21 CFR 
Part 11 Subteam 

Interests: 

• Ethical and legal implications of experiments with clinical specimens and data 

• Application of pharmacogenomics in clinical trials 



1998 - 2001 Senior Scientist 

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities: 

• Management of a laboratory of up to nineteen, including postdoctoral fellow, associate
scientist, senior research associate and research assistant/associate levels

• Management of a $750K budget 

• DNA sequencing core facility supporting a 350+ person research facility. 

• DNA sequencing for high throughput gene discovery - ESTs, cDNAs, and constructs

• Genomic sequence analysis and gene identification 

• DNA sequence and primary protein analysis 

Research: 

• Chromosomal localization of novel genes 

• Identification and characterization of genes contributing to the oncogenic process 

• Identification and characterization of genes contributing to inflammatory diseases 

• Design and development of schemes for high throughput genomic DNA sequence analysis 

• Candidate gene prediction and evaluation 



Genentech, Inc. 
1 DNA Way 

South San Francisco, CA, 94080 

650.225.6429 

goddarda@gene.com 

1993-1998 Scientist

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities 

• DNA sequencing core facility supporting a 350+ person research facility 

• Assumed responsibility for a pre-existing team of five technicians and expanded the group 
into fifteen, introducing a level of middle management and additional areas of research 

• Participated in the development of the basic plan for high throughput secreted protein 
discovery program - sequencing strategies, data analysis and tracking, database design 

• High throughput EST and cDNA sequencing for new gene identification. 

• Design and implementation of analysis tools required for high throughput gene identification. 

• Chromosomal localization of genes encoding novel secreted proteins. 

Research: 

• Genomic sequence scanning for new gene discovery. 

• Development of signal peptide selection methods. 

• Evaluation of candidate disease genes. 

• Growth hormone receptor gene SNPs in children with Idiopathic short stature 

Imperial Cancer Research Fund 1989-1992 
London, UK with Dr. Ellen Solomon 

6/89 -12/92 Postdoctoral Fellow 

• Cloning and characterization of the genes fused at the acute promyelocytic leukemia 
translocation breakpoints on chromosomes 17 and 15. 

• Prepared a successfully funded European Union multi-center grant application 

McMaster University 1983 
Hamilton, Ontario, Canada with Dr. G. D. Sweeney 

5/83 - 8/83: NSERC Summer Student 

• In vitro metabolism of β-naphthoflavone in C57BL/6J and DBA mice



EDUCATION 

Ph.D. University of Toronto, Toronto, Ontario, Canada. 1989
"Phenotypic and genotypic effects of mutations in the human retinoblastoma gene."
Department of Medical Biophysics.
Supervisor: Dr. R. A. Phillips

Honours B.Sc. McMaster University, Hamilton, Ontario, Canada. 1983
"The in vitro metabolism of the cytochrome P-448 inducer β-naphthoflavone in C57BL/6J mice."
Department of Biochemistry
Supervisor: Dr. G. D. Sweeney

ACADEMIC AWARDS

Imperial Cancer Research Fund Postdoctoral Fellowship 1989-1992

Medical Research Council Studentship 1983-1988

NSERC Undergraduate Summer Research Award 1983

Society of Chemical Industry Merit Award (Hons. Biochem.) 1983

Dr. Harry Lyman Hooker Scholarship 1981-1983

J.L.W. Gill Scholarship 1981-1982

Business and Professional Women's Club Scholarship 1980-1981

Weyerhaeuser Foundation Scholarship 1979-1980


INVITED PRESENTATIONS 

Genentech's gene discovery pipeline: High throughput identification, cloning and 
characterization of novel genes. Functional Genomics: From Genome to Function, Litchfield 
Park, AZ, USA. October 2000 

High throughput identification, cloning and characterization of novel genes. G2K: Back to
Science, Advances in Genome Biology and Technology I. Marco Island, FL, USA. February
2000

Quality control in DNA Sequencing: The use of Phred and Phrap. Bay Area Sequencing
Users Meeting, Berkeley, CA, USA. April 1999

High throughput secreted protein identification and cloning. Tenth International Genome
Sequencing and Analysis Conference, Miami, FL, USA. September 1998

The evolution of DNA sequencing: The Genentech perspective. Bay Area Sequencing Users
Meeting, Berkeley, CA, USA. May 1998

Partial Growth Hormone Insensitivity: The role of GH-receptor mutations in Idiopathic Short
Stature. Tenth Annual National Cooperative Growth Study Investigators Meeting, San
Francisco, CA, USA. October 1996

Growth hormone (GH) receptor defects are present in selected children with non-GH-deficient
short stature: A molecular basis for partial GH-insensitivity. 76th Annual Meeting of The
Endocrine Society, Anaheim, CA, USA. June 1994

A previously uncharacterized gene, myl, is fused to the retinoic acid receptor alpha gene in
acute promyelocytic leukemia. XV International Association for Comparative Research on
Leukemia and Related Disease, Padua, Italy. October 1991



PATENTS 

Goddard A, Godowski PJ, Gurney AL. NL2 Tie ligand homologue polypeptide. Patent 
Number: 6,455,496. Date of Patent: Sept. 24, 2002.

Goddard A, Godowski PJ and Gurney AL. NL3 Tie ligand homologue nucleic acids. Patent 
Number: 6,426,218. Date of Patent: July 30, 2002. 

Godowski P, Gurney A, Hillan KJ, Botstein D, Goddard A, Roy M, Ferrara N, Tumas D, 
Schwall R. NL4 Tie ligand homologue nucleic acid. Patent Number: 6,413,770. Date of
Patent: July 2, 2002. 

Ashkenazi A, Fong S, Goddard A, Gurney AL, Napier MA, Tumas D, Wood WI. Nucleic acid
encoding A-33 related antigen polypeptides. Patent Number: 6,410,708. Date of Patent:
Jun. 25, 2002. 

Botstein DA, Cohen RL, Goddard AD, Gurney AL, Hillan KJ, Lawrence DA, Levine AJ,
Pennica D, Roy MA and Wood WI. WISP polypeptides and nucleic acids encoding same.
Patent Number: 6,387,657. Date of Patent: May 14, 2002.

Goddard A, Godowski PJ and Gurney AL. Tie ligands. Patent Number: 6,372,491. Date of 
Patent: April 16, 2002. 

Godowski PJ, Gurney AL, Goddard A and Hillan K. TIE ligand homologue antibody. Patent 
Number: 6,350,450. Date of Patent: Feb. 26, 2002. 

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Tie 
receptor tyrosine kinase ligand homologues. Patent Number: 6,348,351. Date of Patent: 
Feb. 19, 2002. 

Goddard A, Godowski PJ and Gurney AL. Ligand homologues. Patent Number: 6,348,350.
Date of Patent: Feb. 19, 2002. 

Attie KM, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth 
hormone insensitivity syndrome. Patent Number: 6,207,640. Date of Patent: March 27, 
2001. 

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Nucleic 
acids encoding NL-3. Patent Number: 6,074,873. Date of Patent: June 13, 2000 

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,824,642. Date of Patent: October 20, 1998 

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,646,113. Date of Patent: July 8, 1997

Multiple additional provisional applications filed 

PUBLICATIONS

Seshasayee D, Dowd P, Gu Q, Erickson S, Goddard AD. Comparative sequence analysis of
the HER2 locus in mouse and man. Manuscript in preparation. 

Abuzzahab MJ, Goddard A, Grigorescu F, Lautier C, Smith RJ and Chernausek SD. Human 
IGF-1 receptor mutations resulting in pre- and post-natal growth retardation. Manuscript in 
preparation. 

Aggarwal S, Xie M-H, Foster J, Frantz G, Stinson J, Corpuz RT, Simmons L, Hillan K,
Yansura DG, Vandlen RL, Goddard AD and Gurney AL. FHFR, a novel receptor for the 
fibroblast growth factors. Manuscript submitted. 

Adams SH, Chui C, Schilbach SL, Yu XX, Goddard AD, Grimaldi JC, Lee J, Dowd P, Colman 
S., Lewin DA. (2001) BFIT, a unique acyl-CoA thioesterase induced in thermogenic brown 
adipose tissue: Cloning, organization of the human gene, and assessment of a potential link 
to obesity. Biochemical Journal 360: 135-142.

Lee J, Ho WH, Maruoka M, Corpuz RT, Baldwin DT, Foster JS, Goddard AD, Yansura DG,
Vandlen RL, Wood WI, Gurney AL. (2001) IL-17E, a novel proinflammatory ligand for the IL-17
receptor homolog IL-17Rh1. Journal of Biological Chemistry 276(2): 1660-1664.

Xie M-H, Aggarwal S, Ho W-H, Foster J, Zhang Z, Stinson J, Wood WI, Goddard AD and
Gurney AL. (2000) Interleukin (IL)-22, a novel human cytokine that signals through the 
interferon-receptor related proteins CRF2-4 and IL-22R. Journal of Biological Chemistry 275: 
31335-31339. 

Weiss GA, Watanabe CK, Zhong A, Goddard A and Sidhu SS. (2000) Rapid mapping of 
protein functional epitopes by combinatorial alanine scanning. Proc. Natl. Acad. Sci. USA 97: 
8950-8954. 

Guo S, Yamaguchi Y, Schilbach S, Wada T, Lee J, Goddard A, French D, Handa H,
Rosenthal A. (2000) A regulator of transcriptional elongation controls vertebrate neuronal 
development. Nature 408: 366-369. 

Yan M, Wang L-C, Hymowitz SG, Schilbach S, Lee J, Goddard A, de Vos AM, Gao WQ, Dixit 
VM. (2000) Two-amino acid molecular switch in an epithelial morphogen that regulates 
binding to two distinct receptors. Science 290: 523-527. 

Sehl PD, Tai JTN, Hillan KJ, Brown LA, Goddard A, Yang R, Jin H and Lowe DG. (2000) 
Application of cDNA microarrays in determining molecular phenotype in cardiac growth, 
development, and response to injury. Circulation 101: 1990-1999. 

Guo S, Brush J, Teraoka H, Goddard A, Wilson SW, Mullins MC and Rosenthal A. (1999) 
Development of noradrenergic neurons in the zebrafish hindbrain requires BMP, FGF8, and 
the homeodomain protein soulless/Phox2A. Neuron 24: 555-566. 

Stone D, Murone M, Luoh S, Ye W, Armanini P, Gurney A, Phillips HS, Brush J, Goddard
A, de Sauvage FJ and Rosenthal A. (1999) Characterization of the human suppressor of 
fused; a negative regulator of the zinc-finger transcription factor Gli. J. Cell Sci. 112: 4437- 
4448. 

Xie M-H, Holcomb I, Deuel B, Dowd P, Huang A, Vagts A, Foster J, Liang J, Brush J, Gu Q, 
Hillan K, Goddard A and Gurney AL. (1999) FGF-19, a novel fibroblast growth factor with
unique specificity for FGFR4. Cytokine 11: 729-735. 

Yan M, Lee J, Schilbach S, Goddard A and Dixit V. (1999) mE10, a novel caspase
recruitment domain-containing proapoptotic molecule. J. Biol. Chem. 274(15): 10287-10292. 

Gurney AL, Marsters SA, Huang RM, Pitti RM, Mark DT, Baldwin DT, Gray AM, Dowd P,
Brush J, Heldens S, Schow P, Goddard AD, Wood WI, Baker KP, Godowski PJ and
Ashkenazi A. (1999) Identification of a new member of the tumor necrosis factor family and its 
receptor, a human ortholog of mouse GITR. Current Biology 9(4): 215-218. 

Ridgway JBB, Ng E, Kern JA, Lee J, Brush J, Goddard A and Carter P. (1999) Identification
of a human anti-CD55 single-chain Fv by subtractive panning of a phage library using tumor 
and nontumor cell lines. Cancer Research 59: 2718-2723. 

Pitti RM, Marsters SA, Lawrence DA, Roy M, Kischkel FC, Dowd P, Huang A, Donahue CJ, 
Sherwood SW, Baldwin DT, Godowski PJ, Wood WI, Gurney AL, Hillan KJ, Cohen RL,
Goddard AD, Botstein D and Ashkenazi A. (1998) Genomic amplification of a decoy receptor 
for Fas ligand in lung and colon cancer. Nature 396(6712): 699-703. 

Pennica D, Swanson TA, Welsh JW, Roy MA, Lawrence DA, Lee J, Brush J, Taneyhill LA, 
Deuel B, Lew M, Watanabe C, Cohen RL, Melhem MF, Finley GG, Quirke P, Goddard AD, 
Hillan KJ, Gurney AL, Botstein D and Levine AJ. (1998) WISP genes are members of the 
connective tissue growth factor family that are up-regulated in Wnt-1-transformed cells and
aberrantly expressed in human colon tumors. Proc. Natl. Acad. Sci. USA. 95(25): 14717- 
14722. 

Yang RB, Mark MR, Gray A, Huang A, Xie MH, Zhang M, Goddard A, Wood WI, Gurney AL
and Godowski PJ. (1998) Toll-like receptor-2 mediates lipopolysaccharide-induced cellular 
signalling. Nature 395(6699): 284-288. 

Merchant AM, Zhu Z, Yuan JQ, Goddard A, Adams CW, Presta LG and Carter P. (1998) An 
efficient route to human bispecific IgG. Nature Biotechnology 16(7): 677-681.

Marsters SA, Sheridan JP, Pitti RM, Brush J, Goddard A and Ashkenazi A (1998) 
Identification of a ligand for the death-domain-containing receptor Apo3. Current Biology 8(9): 
525-528. 

Xie J, Murone M, Luoh SM, Ryan A, Gu Q, Zhang C, Bonifas JM, Lam CW, Hynes M, 
Goddard A, Rosenthal A, Epstein EH Jr. and de Sauvage FJ. (1998) Activating Smoothened 
mutations in sporadic basal-cell carcinoma. Nature. 391(6662): 90-92. 

Marsters SA, Sheridan JP, Pitti RM, Huang A, Skubatch M, Baldwin D, Yuan J, Gurney A, 
Goddard AD, Godowski P and Ashkenazi A. (1997) A novel receptor for Apo2L/TRAIL 
contains a truncated death domain. Current Biology. 7(12): 1003-1006. 

Hynes M, Stone DM, Dowd M, Pitts-Meek S, Goddard A, Gurney A and Rosenthal A. (1997) 
Control of cell pattern in the neural tube by the zinc finger transcription factor Gli1. Neuron
19:15-26. 

Sheridan JP, Marsters SA, Pitti RM, Gurney A., Skubatch M, Baldwin D, Ramakrishnan L, 
Gray CL, Baker K, Wood WI, Goddard AD, Godowski P, and Ashkenazi A. (1997) Control of
TRAIL-Induced Apoptosis by a Family of Signaling and Decoy Receptors. Science 277
(5327): 818-821.

Goddard AD, Dowd P, Chernausek S, Geffner M, Gertner J, Hintz R, Hopwood N, Kaplan S,
Plotnick L, Rogol A, Rosenfield R, Saenger P, Mauras N, Hershkopf R, Angulo M and Attie, K. 
(1997) Partial growth hormone insensitivity: The role of growth hormone receptor mutations in 
idiopathic short stature. J. Pediatr. 131: S51-55. 

Klein RD, Sherman D, Ho WH, Stone D, Bennett GL, Moffat B, Vandlen R, Simmons L, Gu Q,
Hongo JA, Devaux B, Poulsen K, Armanini M, Nozaki C, Asai N, Goddard A, Phillips H,
Henderson CE, Takahashi M and Rosenthal A. (1997) A GPI-linked protein that interacts with
Ret to form a candidate neurturin receptor. Nature. 387(6634): 717-21. 

Stone DM, Hynes M, Armanini M, Swanson TA, Gu Q, Johnson RL, Scott MP, Pennica D, 
Goddard A, Phillips H, Noll M, Hooper JE, de Sauvage F and Rosenthal A. (1996) The 
tumour-suppressor gene patched encodes a candidate receptor for Sonic hedgehog. Nature 
384(6605): 129-34. 

Marsters SA, Sheridan JP, Donahue CJ, Pitti RM, Gray CL, Goddard AD, Bauer KD and 
Ashkenazi A. (1996) Apo-3, a new member of the tumor necrosis factor receptor family, 
contains a death domain and activates apoptosis and NF-kappa B. Current Biology 6(12):
1669-76. 

Rothe M, Xiong J, Shu HB, Williamson K, Goddard A and Goeddel DV. (1996) I-TRAF is a
novel TRAF-interacting protein that regulates TRAF-mediated signal transduction. Proc. Natl. 
Acad. Sci. USA 93: 8241-8246. 

Yang M, Luoh SM, Goddard A, Reilly D, Henzel W and Bass S. (1996) The bglX gene 
located at 47.8 min on the Escherichia coli chromosome encodes a periplasmic beta- 
glucosidase. Microbiology 142: 1659-65. 

Goddard AD and Black DM. (1996) Familial Cancer in Molecular Endocrinology of Cancer. 
Waxman, J. Ed. Cambridge University Press, Cambridge UK, pp. 187-215.

Treanor JJS, Goodman L, de Sauvage F, Stone DM, Poulson KT, Beck CD, Gray C, Armanini 
MP, Pollocks RA, Hefti F, Phillips HS, Goddard A, Moore MW, Buj-Bello A, Davis AM, Asai N, 
Takahashi M, Vandlen R, Henderson CE and Rosenthal A. (1996) Characterization of a
receptor for GDNF. Nature 382: 80-83. 

Klein RD, Gu Q, Goddard A and Rosenthal A (1996) Selection for genes encoding secreted 
proteins and receptors. Proc. Natl. Acad. Sci. USA 93: 7108-7113. 

Winslow JW, Moran P, Valverde J, Shih A, Yuan JQ, Wong SC, Tsai SP, Goddard A, Henzel 
WJ, Hefti F and Caras I. (1995) Cloning of AL-1, a ligand for an Eph-related tyrosine kinase 
receptor involved in axon bundle formation. Neuron 14: 973-981. 

Bennett BD, Zeigler FC, Gu Q, Fendly B, Goddard AD, Gillett N and Matthews W. (1995) 
Molecular cloning of a ligand for the EPH-related receptor protein-tyrosine kinase Htk. Proc. 
Natl. Acad. Sci. USA 92: 1866-1870. 

Huang X, Yuang J, Goddard A, Foulis A, James RF, Lernmark A, Pujol-Borrell R, 
Rabinovitch A, Somoza N and Stewart TA. (1995) Interferon expression in the pancreases of 
patients with type I diabetes. Diabetes 44: 658-664. 

Goddard AD, Yuan JQ, Fairbairn L, Dexter M, Borrow J, Kozak C and Solomon E. (1995) 
Cloning of the murine homolog of the leukemia-associated PML gene. Mammalian Genome 
6: 732-737. 

Goddard AD, Covello R, Luoh SM, Clackson T, Attie KM, Gesundheit N, Rundle AC, Wells
JA, Carlsson LMS and The Growth Hormone Insensitivity Study Group. (1995) Mutations of
the growth hormone receptor in children with idiopathic short stature. N. Engl. J. Med. 333: 
1093-1098. 

Kuo SS, Moran P, Gripp J, Armanini M, Phillips HS, Goddard A and Caras IW. (1994) 
Identification and characterization of Batk, a predominantly brain-specific non-receptor protein 
tyrosine kinase related to Csk. J. Neurosci. Res. 38: 705-715. 

Mark MR, Scadden DT, Wang Z, Gu Q, Goddard A and Godowski PJ. (1994) Rse, a novel 
receptor-type tyrosine kinase with homology to Axl/Ufo, is expressed at high levels in the 
brain. Journal of Biological Chemistry 269: 10720-10728. 

Borrow J, Shipley J, Howe K, Kiely F, Goddard A, Sheer D, Srivastava A, Antony AC,

SIMULTANEOUS AMPLIFICATION AND DETECTION OF
SPECIFIC DNA SEQUENCES

Russell Higuchi*, Gavin Dollinger¹, P. Sean Walsh and Robert Griffith

Roche Molecular Systems, Inc., 1400 53rd St., Emeryville, CA 94608. ¹Chiron Corporation, 1400 53rd St., Emeryville, CA
94608. *Corresponding author.



We have enhanced the polymerase chain reaction (PCR) such that specific DNA sequences
can be detected without opening the reaction tube. This enhancement requires the addition
of ethidium bromide (EtBr) to a PCR. Since the fluorescence of EtBr increases in the
presence of double-stranded (ds) DNA, an increase in fluorescence in such a PCR indicates
a positive amplification, which can be easily monitored externally. In fact, amplification
can be continuously monitored in order to follow its progress. The ability to
simultaneously amplify specific DNA sequences and detect the product of the amplification
both simplifies and improves PCR and may facilitate its automation and more widespread
use in the clinic or in other situations requiring high sample throughput.

Although the potential benefits of PCR¹ to clinical diagnostics are well known, it is
still not widely used in this setting, even though it is four years since thermostable
DNA polymerase made PCR practical. Some of the reasons for its slow acceptance are high
cost, lack of automation of pre- and post-PCR processing steps, and false positive
results from carryover-contamination. The first two points are related in that labor is
the largest contributor to cost at the present stage of PCR development. Most current
assays require some form of "downstream" processing once thermocycling is done in order
to determine whether the target DNA sequence was present and has amplified. These include
DNA hybridization, gel electrophoresis with or without use of restriction digestion,
HPLC, or capillary electrophoresis¹⁰. These methods are labor-intense, have low
throughput, and are difficult to automate. The third point is also closely related to
downstream processing. The handling of the PCR product in these downstream processes
increases the chances that amplified DNA will spread through the typing lab, resulting in
a risk of "carryover" false positives in subsequent testing.

These downstream processing steps would be eliminated if specific amplification and
detection of amplified DNA took place simultaneously within an unopened reaction vessel.
Assays in which such different processes take place without the need to separate reaction
components have been termed "homogeneous". No truly homogeneous PCR assay has been
demonstrated to date, although progress towards this end has been reported. Chehab, et
al. developed a PCR product detection scheme using fluorescent primers that resulted in a
fluorescent PCR product. Allele-specific primers, each with different fluorescent tags,
were used to indicate the genotype of the DNA. However, the unincorporated primers must
still be removed in a downstream process in order to visualize the result. Recently,
Holland, et al.¹³ developed an assay in which the endogenous 5' exonuclease activity of
Taq DNA polymerase was exploited to cleave a labeled oligonucleotide probe. The probe
would only cleave if PCR amplification had produced its complementary sequence. In order
to detect the cleavage products, however, a subsequent process is again needed.

We have developed a truly homogeneous assay for PCR and PCR product detection based upon
the greatly increased fluorescence that ethidium bromide and other DNA binding dyes
exhibit when they are bound to ds-DNA¹⁴⁻¹⁶. As outlined in Figure 1, a prototype PCR

FIGURE 1 Principle of simultaneous amplification and detection of PCR product. The
components of a PCR containing EtBr that are fluorescent are listed: EtBr itself, EtBr
bound to either ssDNA or dsDNA. There is a large fluorescence enhancement when EtBr is
bound to DNA and binding is greatly enhanced when DNA is double-stranded. After
sufficient (n) cycles of PCR, the net increase in dsDNA results in additional EtBr
binding, and a net increase in total fluorescence.

FIGURE 2 Gel electrophoresis of PCR amplification products of the human nuclear gene,
HLA DQα, made in the presence of increasing amounts of EtBr (up to 6 µg/ml). The presence
of EtBr has no obvious effect on the yield or specificity of amplification.



FIGURE 3 (A) Fluorescence measurements from PCRs that contain 0.5 µg/ml EtBr and that are
specific for Y-chromosome repeat sequences. Five replicate PCRs were begun containing
each of the DNAs specified. At each indicated cycle, one of the five replicate PCRs for
each DNA was removed from thermocycling and its fluorescence measured. Units of
fluorescence are arbitrary. (B) UV photography of PCR tubes (0.5 ml Eppendorf-style,
polypropylene micro-centrifuge tubes) containing reactions, those starting from 2 ng male
DNA and control reactions without any DNA, from (A).



begins with primers that are single-stranded DNA (ss» 
DNA), dNTPs, and DNA polymerase: An amount of 
dsDNA containing the target sequence (target DNA) is 
also typically present. This amount can vary, depending 
on the application, from single-cell amounts of DNA 17 to 
micrograms per FCR^, If EtBr is present, the reagents 
that will fluoresce, in order of increasing fluorescence, are 
free EtBr itself* and EtBr bound to the single-stranded 
DNA primers and to the doubles trended target DNA (by 
its tntercaJation between the stacked bases of die DNA 
doobk^hcax). After the first denaturation cyde. target 
DNA win be largely single-stranded. After a KR h 
completed, the most significant change is the increase in 
the amount of dsDNA (the PGR product itself) of up to 
several tnk*6grams- Formerly free EtBr « bound to the 
additional dsDNA* resulting in an increase in fluores- 
cence* There is also some decrease in the amount of 
ssDNA primer, but because tbc binding of EtBr to ssDNA 
is much Jess than to dsDNA, the effect of this change on 
the total fluorescence of the sample is smalt The fluores- 
cence increase can be measured by directing excitation 
iUuminaiion through the walls of the amplification vessel 



before and after, or even ccmiinuously during, thermocy- 
ding. 
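The additive picture described above can be sketched numerically. The following is only an illustration: the function name, the relative brightness coefficients, and the EtBr amounts are hypothetical values chosen to respect the stated ordering (free EtBr < ssDNA-bound < dsDNA-bound), not measurements from this work.

```python
def total_fluorescence(free_etbr, ssdna_bound, dsdna_bound,
                       f_free=1.0, f_ss=5.0, f_ds=25.0):
    """Sum the fluorescence of the three EtBr species (arbitrary units).

    The coefficients f_free < f_ss < f_ds are hypothetical; they only
    encode the ordering of fluorescence described in the text.
    """
    return f_free * free_etbr + f_ss * ssdna_bound + f_ds * dsdna_bound


# Hypothetical distribution of 100 units of EtBr before and after PCR:
# formerly free dye moves onto newly made dsDNA, primers are consumed.
before = total_fluorescence(free_etbr=90, ssdna_bound=8, dsdna_bound=2)
after = total_fluorescence(free_etbr=60, ssdna_bound=5, dsdna_bound=35)
print(before, after)
```

In this toy model the loss of ssDNA-bound dye lowers the signal only slightly, while the gain in dsDNA-bound dye dominates the total, which is the qualitative behavior the paragraph describes.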

RESULTS 

PCR in the presence of EtBr. In order to assess the
effect of EtBr in PCR, amplifications of the human HLA
DQα gene¹⁹ were performed with the dye present at
concentrations from 0.06 to 8.0 μg/ml (a typical
concentration of EtBr used in staining of nucleic acids following
gel electrophoresis is 0.5 μg/ml). As shown in Figure 2, gel
electrophoresis revealed little or no difference in the yield
or quality of the amplification product whether EtBr was
absent or present at any of these concentrations,
indicating that EtBr does not inhibit PCR.

Detection of human Y-chromosome specific
sequences. Sequence-specific, fluorescence enhancement of
EtBr as a result of PCR was demonstrated in a series of
amplifications containing 0.5 μg/ml EtBr and primers
specific to repeat DNA sequences found on the human
Y-chromosome²⁰. These PCRs initially contained either
60 ng male, 60 ng female, 2 ng male human or no DNA.
Five replicate PCRs were begun for each DNA. After
17, 21, 24 and 29 cycles of thermocycling, a PCR for each
DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted
vs. amplification cycle number (Fig. 3A). The shape of this
curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is
becoming linear and not exponential with cycle number.
As shown, the fluorescence increased about three-fold
over the background fluorescence for the PCRs
containing human male DNA, but did not significantly increase
for negative control PCRs, which contained either no
DNA or human female DNA. The more male DNA
present to begin with (60 ng versus 2 ng), the fewer
cycles were needed to give a detectable increase in
fluorescence. Gel electrophoresis on the products of these
amplifications showed that DNA fragments of the
expected size were made in the male DNA containing
reactions and that little DNA synthesis took place in the
control samples.
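The dependence of the detection cycle on starting amount can be illustrated with an idealized calculation. The sketch below assumes perfect exponential amplification up to a fixed detection threshold, which is a simplification (as noted above, growth is already becoming linear by the time fluorescence is detectable). The function name, the `efficiency` parameter, and the threshold value are hypothetical.

```python
import math


def cycles_to_threshold(start_amount, threshold_amount, efficiency=1.0):
    """Cycles needed for start_amount to reach threshold_amount.

    Assumes (hypothetically) that product grows by a factor of
    (1 + efficiency) per cycle; efficiency=1.0 is perfect doubling.
    Amounts are in the same arbitrary units.
    """
    if start_amount <= 0 or threshold_amount <= 0:
        raise ValueError("amounts must be positive")
    if start_amount >= threshold_amount:
        return 0
    ratio = threshold_amount / start_amount
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


# Relative starting amounts of 60 and 2, with an arbitrary threshold.
cycles_60 = cycles_to_threshold(60, 1e6)
cycles_2 = cycles_to_threshold(2, 1e6)
print(cycles_60, cycles_2)
```

Under these assumptions a 30-fold difference in starting DNA corresponds to roughly log2(30), or about five, cycles, in the same direction as the 60 ng versus 2 ng comparison.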

In addition, the increase in fluorescence was visualized
by simply laying the completed, unopened PCRs on a UV
transilluminator and photographing them through a red
filter. This is shown in Figure 3B for the reactions that
began with 2 ng male DNA and those with no DNA.

Detection of specific alleles of the human β-globin
gene. In order to demonstrate that this approach has
adequate specificity to allow genetic screening, a detection
of the sickle-cell anemia mutation was performed. Figure
4 shows the fluorescence from completed amplifications
containing EtBr (0.5 μg/ml) as detected by photography
of the reaction tubes on a UV transilluminator. These
reactions were performed using primers specific for
either the wild-type or sickle-cell mutation of the human
β-globin gene²¹. The specificity for each allele is imparted
by placing the sickle-mutation site at the terminal 3'
nucleotide of one primer. By using an appropriate primer
annealing temperature, primer extension, and thus
amplification, can take place only if the 3' nucleotide of the
primer is complementary to the β-globin allele present.
Each pair of amplifications shown in Figure 4 consists of
a reaction with either the wild-type allele specific (left
tube) or sickle-allele specific (right tube) primers. Three
different DNAs were typed: DNA from a homozygous,
wild-type β-globin individual (AA); from a heterozygous
sickle β-globin individual (AS); and from a homozygous
sickle β-globin individual (SS). Each DNA (50 ng genomic
DNA to start each PCR) was analyzed in triplicate (3 pairs
of reactions each). The DNA type was reflected in the
relative fluorescence intensities in each pair of completed
amplifications. There was a significant increase in
fluorescence only where a β-globin allele DNA matched the
primer set. When measured on a spectrofluorometer
(data not shown), this fluorescence was about three times
that present in a PCR where both β-globin alleles were
mismatched to the primer set. Gel electrophoresis (not
shown) established that this increase in fluorescence was
due to the synthesis of nearly a microgram of a DNA
fragment of the expected size for β-globin. There was
little synthesis of dsDNA in reactions in which the
allele-specific primer was mismatched to both alleles.

Continuous monitoring of a PCR. Using a fiber optic
device, it is possible to direct excitation illumination from
a spectrofluorometer to a PCR undergoing thermocycling
and to return its fluorescence to the spectrofluorometer.
The fluorescence readout of such an arrangement
directed at an EtBr-containing amplification of
Y-chromosome specific sequences from 25 ng of human male DNA
is shown in Figure 5. The readout from a control PCR
with no target DNA is also shown. Thirty cycles of PCR
were monitored for each.

The fluorescence trace as a function of time clearly
shows the effect of the thermocycling. Fluorescence
intensity rises and falls inversely with temperature. The
fluorescence intensity is minimum at the denaturation
temperature (94°C) and maximum at the annealing/extension
temperature (50°C). In the negative-control PCR, these
fluorescence maxima and minima do not change
significantly over the thirty thermocycles, indicating that there is
little dsDNA synthesis without the appropriate target
DNA, and there is little if any bleaching of EtBr during
the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence
maxima at the annealing/extension temperature begin to
increase at about 4000 seconds of thermocycling, and
continue to increase with time, indicating that dsDNA is
being produced at a detectable level. Note that the
fluorescence minima at the denaturation temperature do not
significantly increase, presumably because at this
temperature there is no dsDNA for EtBr to bind. Thus the course
of the amplification is followed by tracking the
fluorescence increase at the annealing temperature. Analysis of
the products of these two amplifications by gel
electrophoresis showed a DNA fragment of the expected size for the
male DNA containing sample and no detectable DNA
synthesis for the control sample.
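Tracking the fluorescence maxima at the annealing/extension temperature, as described above, might be reduced to code along the following lines. This is a sketch under stated assumptions: the trace is taken to be a list of time stamps and readings, the cycle length is assumed constant, and the function names, the five-cycle baseline, and the 1.2-fold rise criterion are all hypothetical choices rather than anything used in this work.

```python
def per_cycle_maxima(times_s, readings, cycle_seconds):
    """Return the highest reading within each thermocycle, in cycle order.

    The maximum of each cycle is taken to be the reading at the
    annealing/extension temperature, where fluorescence peaks.
    """
    maxima = {}
    for t, value in zip(times_s, readings):
        cycle = int(t // cycle_seconds)
        if cycle not in maxima or value > maxima[cycle]:
            maxima[cycle] = value
    return [maxima[c] for c in sorted(maxima)]


def first_rising_cycle(maxima, baseline_cycles=5, rise_factor=1.2):
    """Index of the first cycle whose maximum exceeds the early baseline.

    baseline_cycles and rise_factor are hypothetical thresholds.
    Returns None if no cycle rises above rise_factor * baseline.
    """
    baseline = sum(maxima[:baseline_cycles]) / baseline_cycles
    for index, value in enumerate(maxima):
        if index >= baseline_cycles and value > rise_factor * baseline:
            return index
    return None


# Synthetic trace: 10 cycles of 120 s, one low (denaturation) and one
# high (annealing/extension) reading per cycle; maxima rise from cycle 7.
times, trace = [], []
for c in range(10):
    high = 2.0 if c < 7 else 2.0 + 0.5 * (c - 6)
    times += [c * 120 + 10, c * 120 + 70]
    trace += [1.0, high]

peaks = per_cycle_maxima(times, trace, cycle_seconds=120)
print(first_rising_cycle(peaks))
```

A negative-control trace, whose maxima stay flat, would return `None` from `first_rising_cycle` under the same thresholds.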

DISCUSSION 

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes
means that the specificity of this homogeneous assay
depends solely on that of PCR. In the case of sickle-cell
disease, we have shown that PCR alone has sufficient DNA
sequence specificity to permit genetic screening. Using
appropriate amplification conditions, there is little
non-specific production of dsDNA in the absence of the
appropriate target allele.

FIGURE 4 UV photography of PCR tubes containing amplifications
using EtBr that are specific to wild-type (A) or sickle (S) alleles of
the human β-globin gene. The left of each pair of tubes contains
allele-specific primers to the wild-type alleles, the right tube
primers to the sickle allele. The photograph was taken after 30
cycles of PCR, and the input DNAs and the alleles they contain
are indicated. Fifty ng of DNA was used to begin PCR. Typing
was done in triplicate (3 pairs of PCRs) for each input DNA.

FIGURE 5 Continuous, real-time monitoring of a PCR. A fiberoptic
was used to carry excitation light to a PCR in progress and also
emitted light back to a fluorometer (see Experimental Protocol).
Amplification using human male-DNA specific primers in a PCR
starting with 20 ng of human male DNA (top), or in a control
PCR without DNA (bottom), were monitored. Thirty cycles of
PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.

The specificity required to detect pathogens can be
more or less than that required to do genetic screening,
depending on the number of pathogens in the sample and
the amount of other DNA that must be taken with the
sample. A difficult target is HIV, which requires detection
of a viral genome that can be at the level of a few copies
per thousands of host cells⁵. Compared with genetic
screening, which is performed on cells containing at least
one copy of the target sequence, HIV detection requires
both more specificity and the input of more total
DNA (up to microgram amounts) in order to have
sufficient numbers of target sequences. This large amount of
starting DNA in an amplification significantly increases
the background fluorescence over which any additional
fluorescence produced by PCR must be detected. An
additional complication that occurs with targets in low
copy-number is the formation of the "primer-dimer"
artifact. This is the result of the extension of one primer
using the other primer as a template. Although this occurs
infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with
true PCR targets if those targets are rare. The
primer-dimer product is of course dsDNA and thus is a potential
source of false signal in this homogeneous assay.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a
number of approaches, including the use of nested-primer
amplifications that take place in a single tube⁸, and the
"hot-start", in which nonspecific amplification is reduced
by raising the temperature of the reaction before DNA
synthesis begins²³. Preliminary results using these
approaches suggest that primer-dimer is effectively reduced
and it is possible to detect the increase in EtBr
fluorescence in a PCR instigated by a single HIV genome in a
background of 10⁵ cells. With larger numbers of cells, the
background fluorescence contributed by genomic DNA
becomes problematic. To reduce this background, it may
be possible to use sequence-specific DNA-binding dyes
that can be made to preferentially bind PCR product over
genomic DNA by incorporating the dye-binding DNA
sequence into the PCR product through a 5' "add-on" to
the oligonucleotide primer²⁴.

We have shown that the detection of fluorescence
generated by an EtBr-containing PCR is straightforward,
both once PCR is completed and continuously during
thermocycling. The ease with which automation of
specific DNA detection can be accomplished is the most
promising aspect of this assay. The fluorescence analysis
of completed PCRs is already possible with existing
instrumentation in 96-well format²⁵. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving
the rack of PCRs to a 96-microwell plate fluorescence
reader²⁶.

The instrumentation necessary to continuously monitor
multiple PCRs simultaneously is also simple in principle.
A direct extension of the apparatus used here is to have
multiple fiberoptics transmit the excitation light and
fluorescent emissions to and from multiple PCRs. The ability
to monitor multiple PCRs continuously may allow
quantitation of target DNA copy number. Figure 3 shows that
the larger the amount of starting target DNA, the sooner
during PCR a fluorescence increase is detected.
Preliminary experiments (Higuchi and Dollinger, manuscript in
preparation) with continuous monitoring have shown a
sensitivity to two-fold differences in initial target DNA
concentration.

Conversely, if the number of target molecules is
known, as it can be in genetic screening, continuous
monitoring may provide a means of detecting false
positive and false negative results. With a known number of
target molecules, a true positive would exhibit detectable
fluorescence by a predictable number of cycles of PCR.
Increases in fluorescence detected before or after that
cycle would indicate potential artifacts. False negative
results due to, for example, inhibition of DNA
polymerase, may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of
cycles, many more than are necessary to detect a true
positive. If a sample fails to have a fluorescence increase
after this many cycles, inhibition may be suspected. Since,
in this assay, conclusions are drawn based on the presence
or absence of fluorescence signal alone, such controls may
be important. In any event, before any test based on this
principle is ready for the clinic, an assessment of its false
positive/false negative rates will need to be obtained using
a large number of known samples.
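The decision logic outlined above might be written down as follows. This is an illustrative sketch only: the function name, the tolerance window, and the returned labels are hypothetical, and the proposal above does not specify how close to the predicted cycle a true positive must fall.

```python
def interpret_run(detect_cycle, expected_cycle, tolerance, marker_detected):
    """Classify one monitored PCR with a known number of target molecules.

    detect_cycle    -- cycle at which fluorescence first rose, or None
    expected_cycle  -- cycle predicted for a true positive (hypothetical input)
    tolerance       -- allowed deviation in cycles (hypothetical parameter)
    marker_detected -- whether the inefficiently amplifying marker
                       eventually produced a fluorescence increase
    """
    if detect_cycle is None:
        # No target signal: the marker tells us whether PCR worked at all.
        return "negative" if marker_detected else "possible inhibition"
    if abs(detect_cycle - expected_cycle) <= tolerance:
        return "consistent with true positive"
    # Signal appearing too early or too late suggests an artifact.
    return "possible artifact"


print(interpret_run(20, expected_cycle=20, tolerance=2, marker_detected=True))
print(interpret_run(None, expected_cycle=20, tolerance=2, marker_detected=False))
```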

In summary, the inclusion in PCR of dyes whose
fluorescence is enhanced upon binding dsDNA makes it
possible to detect specific DNA amplification from outside
the PCR tube. In the future, instruments based upon this
principle may facilitate the more widespread use of PCR
in applications that demand the high throughput of
samples.

EXPERIMENTAL PROTOCOL 

Human HLA-DQα gene amplifications containing EtBr.
PCRs were set up in 100 μl volumes containing 10 mM Tris-HCl,
pH 8.3; 50 mM KCl; 4 mM MgCl2; 15 units of Taq DNA
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each
of human HLA-DQα gene specific oligonucleotide primers
GH26 and GH27¹⁹ and approximately 10⁷ copies of DQα PCR
product diluted from a previous reaction. Ethidium bromide
(EtBr; Sigma) was used at the concentrations indicated in Figure
2. Thermocycling proceeded for 20 cycles in a model 480
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a
"step-cycle" program of 94°C for 1 min. denaturation and
[illegible]°C for 30 sec. annealing and 72°C for 30 sec. extension.

Y-chromosome specific PCR. PCRs (100 μl total reaction
volume) containing 0.5 μg/ml EtBr were prepared as described
for HLA-DQα, except with different primers and target DNAs.
These PCRs contained 15 pmole each male DNA-specific primers
Y1.1 and Y1.2²⁰, and either 60 ng male, 60 ng female, 2 ng male,
or no human DNA. Thermocycling was 94°C for 1 min. and 60°C
for 1 min. using a "step-cycle" program. The number of cycles for
a sample were as indicated in Figure 3. Fluorescence
measurement is described below.

Allele-specific, human β-globin gene PCR. Amplifications of
100 μl volume using 0.5 μg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair
BGP2/Hβ14A (wild-type globin specific primers) or BGP2/Hβ14S
(sickle-globin specific primers) at 10 pmole each primer per PCR.
These primers were developed by Wu et al.²¹. Three different
target DNAs were used in separate amplifications: 50 ng each of
human DNA that was homozygous for the sickle trait (SS), DNA
that was heterozygous for the sickle trait (AS), or DNA that was
homozygous for the w.t. globin (AA). Thermocycling was for 30
cycles at 94°C for 1 min. and 55°C for 1 min. using a "step-cycle"
program. An annealing temperature of 55°C had been shown by
Wu et al.²¹ to provide allele-specific amplification. Completed
PCRs were photographed through a red filter ([illegible])
after placing the reaction tubes atop a model TM-36
transilluminator (UV-products, San Gabriel, CA).

Fluorescence measurement. Fluorescence measurements were
made on PCRs containing EtBr in a Fluorolog-2 fluorometer
(SPEX, Edison, NJ). Excitation was at the 500 nm band with
about 2 nm bandwidth with a GG 435 nm cut-off filter (Melles
Griot, Inc., Irvine, CA) to exclude second-order light. Emitted
light was detected at 570 nm with a bandwidth of about 7 nm. An
OG 530 nm cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation
light to, and receive emitted light from, a PCR placed in a well of
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end
of the fiberoptic cable was attached [illegible] to the
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube
with its cap removed), effectively sealing it. The exposed top of
the PCR tube and the end of the fiberoptic cable were shielded
from room light and the room lights were kept dimmed during
each run. The monitored PCR was an amplification of
Y-chromosome-specific repeat sequences as described above, except
using an annealing/extension temperature of 50°C. The reaction
was covered with mineral oil (2 drops) to prevent evaporation.
Thermocycling and fluorescence measurement were started
simultaneously. A time-base scan with a 10 second integration time
was used and the emission signal was ratioed to the excitation
signal to control for changes in light-source intensity. Data were
collected using the dm3000f, version 5.5 (SPEX) data system.

We thank Bob Jones for help with the spectrofluorometric
measurements and Heatherbell Fong for editing this manuscript.

References

1. Mullis, K., Faloona, F., Scharf, S., Saiki, R., Horn, G. and Erlich, H.
1986. Specific enzymatic amplification of DNA in vitro: The
polymerase chain reaction. CSHSQB 51:263-273.

2. White, T. J., Arnheim, N. and Erlich, H. A. 1989. The polymerase
chain reaction. Trends Genet. 5:185-189.

3. Erlich, H. A., Gelfand, D. and Sninsky, J. J. 1991. Recent advances in
the polymerase chain reaction. Science 252:1643-1651.

4. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R.,
Horn, G. T., Mullis, K. B. and Erlich, H. A. 1988. Primer-directed
enzymatic amplification of DNA with a thermostable DNA
polymerase. Science 239:487-491.

5. Saiki, R. K., Walsh, P. S., Levenson, C. H. and Erlich, H. A. 1989.
Genetic analysis of amplified DNA with immobilized sequence-specific
oligonucleotide probes. Proc. Natl. Acad. Sci. USA 86:6230-6234.

6. Kwok, S. Y., Mack, D. H., Mullis, K. B., Poiesz, B. J., Ehrlich, G. D.,
Blair, D. and Friedman-Kien, A. S. 1987. Identification of human
immunodeficiency virus sequences by using in vitro enzymatic
amplification and oligomer cleavage detection. J. Virol. 61:1690-1694.

7. Chehab, F. F., Doherty, M., Cai, S. P., Kan, Y. W., Cooper, S. and
Rubin, E. M. 1987. Detection of sickle cell anemia and thalassemias.
Nature 329:293-294.

8. Horn, G. T., Richards, B. and Klinger, K. W. 1989. Amplification of a
highly polymorphic VNTR segment by the polymerase chain reaction.
Nuc. Acids Res. 17:2140.

9. Katz, E. D. and Dong, M. W. 1990. Rapid analysis and purification of
polymerase chain reaction products by high-performance liquid
chromatography. Biotechniques 8:546-555.

10. Heiger, D. N., Cohen, A. S. and Karger, B. L. 1990. Separation of
DNA restriction fragments by high performance capillary
electrophoresis with low and zero crosslinked polyacrylamide using
continuous and pulsed electric fields. J. Chromatogr.

11. Kwok, S. and Higuchi, R. 1989. Avoiding false positives with
PCR. Nature 339:237-238.

12. Chehab, F. F. and Kan, Y. W. 1989. Detection of specific DNA
sequences by fluorescence amplification: a color complementation
assay. Proc. Natl. Acad. Sci. USA 86:9178-9182.

13. Holland, P. M., Abramson, R. D., Watson, R. and Gelfand, D. H.
1991. Detection of specific polymerase chain reaction product by
utilizing the 5' to 3' exonuclease activity of Thermus aquaticus DNA
polymerase. Proc. Natl. Acad. Sci. USA 88:7276-7280.

14. Markovits, J., Roques, B. P. and Le Pecq, J. B. 1979. Ethidium dimer:
a new reagent for the fluorimetric determination of nucleic acids.
Anal. Biochem. 94:259-264.

15. Kapuscinski, J. and Szer, W. 1979. Interactions of
4',6-diamidine-2-phenylindole with synthetic polynucleotides.
Nuc. Acids Res. 6:3519-3534.

16. Searle, M. S. and Embrey, K. J. 1990. Sequence-specific interaction of
Hoechst 33258 with the minor groove of an adenine-tract DNA
duplex studied in solution by 1H NMR spectroscopy. Nuc. Acids Res.

17. Li, H., Gyllensten, U. B., Cui, X., Saiki, R. K., Erlich, H. A. and
Arnheim, N. 1988. Amplification and analysis of DNA sequences in
single human sperm and diploid cells. Nature 335:414-417.

18. Abbott, M. A., Poiesz, B. J., Byrne, B. C., Kwok, S., Sninsky, J. J.
and Ehrlich, G. D. 1988. Enzymatic gene amplification: qualitative and
quantitative methods for detecting proviral DNA amplified in vitro.
J. Infect. Dis. 158:1158.

19. Saiki, R. K., Bugawan, T. L., Horn, G. T., Mullis, K. B. and Erlich,
H. A. 1986. Analysis of enzymatically amplified β-globin and
HLA-DQα DNA with allele-specific oligonucleotide probes. Nature
324:163-166.

20. Kogan, S. C., Doherty, M. and Gitschier, J. 1987. An improved
method for prenatal diagnosis of genetic diseases by analysis of
amplified DNA sequences. N. Engl. J. Med. 317:985-990.

21. Wu, D. Y., Ugozzoli, L., Pal, B. K. and Wallace, R. B. 1989.
Allele-specific enzymatic amplification of β-globin genomic DNA for
diagnosis of sickle cell anemia. Proc. Natl. Acad. Sci. USA 86:2757-2760.

22. Kwok, S., Kellogg, D. E., McKinney, N., Spasic, D., Goda, L.,
Levenson, C. and Sninsky, J. J. 1990. Effects of primer-template
mismatches on the polymerase chain reaction: Human immunodeficiency
virus type 1 model studies. Nuc. Acids Res. 18:999-1005.

23. Chou, Q., Russell, M., Birch, D., Raymond, J. and Bloch, W. 1991.
Prevention of pre-PCR mis-priming and primer dimerization
improves low-copy-number amplifications. Submitted.

24. Higuchi, R. 1989. Using PCR to engineer DNA, p. 61-70. In: PCR
Technology, H. A. Erlich (Ed.). Stockton Press, New York, N.Y.

25. Haff, L., Atwood, J. G., DiCesare, J., Katz, E., Picozza, E., Williams,
J. F. and Woudenberg, T. 1991. A high-performance system for
automation of the polymerase chain reaction. Biotechniques
10:102-103, 106-112.

26. Tumosa, N. and Kahan, L. 1989. Fluorescent EIA screening of
monoclonal antibodies to cell surface antigens. J. Immun. Meth.
116:59-63.




We have developed a novel "real time" quantitative PCR

[...] molecule determination (at least five [...]
PCR is extremely accurate and less labor-intensive than current quantitative PCR methods.



Quantitative nucleic acid sequence analysis has
had an important role in many fields of biological
research. Measurement of gene expression
(RNA) has been used extensively in monitoring
biological responses to various stimuli (Tan et al.
1994; Huang et al. 1995a,b; Prud'homme et al.
1995). Quantitative gene analysis (DNA) has
been used to determine the genome quantity of a
particular gene, as in the case of the human HER2
gene, which is amplified in ~30% of breast
tumors (Slamon et al. 1987). Gene and genome
quantitation (DNA and RNA) also have been used
for analysis of human immunodeficiency virus
(HIV) burden demonstrating changes in the
levels of virus throughout the different phases of the
disease (Connor et al. 1993; Piatak et al. 1993b;
Furtado et al. 1995).

Many methods have been described for the
quantitative analysis of nucleic acid sequences
(both for RNA and DNA; Southern 1975; Sharp et
al. 1980; Thomas 1980). Recently, PCR has
proven to be a powerful tool for quantitative
nucleic acid analysis. PCR and reverse
transcriptase (RT)-PCR have permitted the analysis of
minimal starting quantities of nucleic acid (as
little as one cell equivalent). This has made
possible many experiments that could not have been
performed with traditional methods. Although
PCR has provided a powerful tool, it is imperative
that it be used properly for quantitation
(Raeymaekers 1995). Many early reports of
quantitative PCR and RT-PCR described quantitation of
the PCR product but did not measure the initial
target sequence quantity. It is essential to design
proper controls for the quantitation of the initial
target sequences (Ferre 1992; Clementi et al.
1993).

Researchers have developed several methods
of quantitative PCR and RT-PCR. One approach
measures PCR product quantity in the log phase
of the reaction before the plateau (Kellogg et al.
1990; Pang et al. 1990). This method requires
that each sample has equal input amounts of
nucleic acid and that each sample under analysis
amplifies with identical efficiency up to the point
of quantitative analysis. A gene sequence
(contained in all samples at relatively constant
quantities, such as β-actin) can be used for sample
amplification efficiency normalization. Using
conventional methods of PCR detection and
quantitation (gel electrophoresis or plate capture
hybridization), it is extremely laborious to assure
that all samples are analyzed during the log phase
of the reaction (for both the target gene and the
normalization gene). Another method,
quantitative competitive (QC)-PCR, has been developed
and is used widely for PCR quantitation. QC-PCR
relies on the inclusion of an internal control
competitor in each reaction (Becker-Andre 1991;
Piatak et al. 1993a,b). The efficiency of each
reaction is normalized to the internal competitor.
A known amount of internal competitor can be
added to each sample. To obtain relative
quantitation, the unknown target PCR product is
compared with the known competitor PCR product.
Success of a quantitative competitive PCR assay
relies on developing an internal control that
amplifies with the same efficiency as the target
molecule. The design of the competitor and the
validation of amplification efficiencies require a
dedicated effort. However, because QC-PCR does
not require that PCR products be analyzed during
the log phase of the amplification, it is the easier
of the two methods to use.
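The comparison of target and competitor product in QC-PCR can be illustrated with a short calculation. The sketch below assumes, as the text requires, that the competitor amplifies with the same efficiency as the target, so that the ratio of final product signals equals the ratio of starting copies. The function name and the example numbers are hypothetical.

```python
def estimate_target_copies(competitor_copies, target_signal, competitor_signal):
    """Estimate starting target copies from a QC-PCR reaction.

    Assumes equal amplification efficiency for target and competitor,
    so the product signal ratio equals the starting copy ratio.
    Signals are in the same arbitrary units (e.g., band intensity).
    """
    if competitor_signal <= 0:
        raise ValueError("competitor signal must be positive")
    return competitor_copies * target_signal / competitor_signal


# Hypothetical reaction spiked with 1000 competitor copies in which the
# target product signal is twice the competitor product signal.
print(estimate_target_copies(1000, target_signal=300.0, competitor_signal=150.0))
```

If the two templates do not amplify with equal efficiency, the ratio drifts with cycle number and this simple proportionality no longer holds, which is why validating the competitor is described above as a dedicated effort.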

Several detection systems are used for
quantitative PCR and RT-PCR analysis: (1) agarose
gels, (2) fluorescent labeling of PCR products and
detection with laser-induced fluorescence using
capillary electrophoresis (Fasco et al. 1995;
Williams et al. 1996) or acrylamide gels, and (3) plate
capture and sandwich probe hybridization
(Mulder et al. 1994). Although these methods proved
successful, each method requires post-PCR
manipulations that add time to the analysis and
may lead to laboratory contamination. The
sample throughput of these methods is limited
(with the exception of the plate capture
approach), and, therefore, these methods are not
well suited for uses demanding high sample
throughput (i.e., screening of large numbers of
biomolecules or analyzing samples for diagnostics
or clinical trials).

Here we report the development of a novel
assay for quantitative DNA analysis. The assay is
based on the use of the 5' nuclease assay first
described by Holland et al. (1991). The method
uses the 5' nuclease activity of Taq polymerase to
cleave a nonextendible hybridization probe
during the extension phase of PCR. The approach
uses dual-labeled fluorogenic hybridization
probes (Lee et al. 1993; Bassler et al. 1995; Livak
et al. 1995a,b). One fluorescent dye serves as a
reporter [FAM (i.e., 6-carboxyfluorescein)] and its
emission spectra is quenched by the second
fluorescent dye, TAMRA (i.e.,
6-carboxy-tetramethyl-rhodamine). The nuclease degradation of the
hybridization probe releases the quenching of the
FAM fluorescent emission, resulting in an
increase in peak fluorescent emission at 518 nm.
The use of a sequence detector (ABI Prism) allows
measurement of fluorescent spectra of all 96 wells
of the thermal cycler continuously during the
PCR amplification. Therefore, the reactions are
monitored in real time. The output data is
described and quantitative analysis of input target
DNA sequences is discussed below.



REAL TIME QUANTITATIVE PCR

RESULTS 

PCR Product Detection in Real Time

The goal was to develop a high-throughput,
sensitive, and accurate gene quantitation assay for
use in monitoring lipid mediated therapeutic
gene delivery. A plasmid encoding human factor
VIII gene sequence, pF8TM (see Methods), was
used as a model therapeutic gene. The assay uses
fluorescent Taqman methodology and an
instrument capable of measuring fluorescence in real
time (ABI Prism 7700 Sequence Detector). The
Taqman reaction requires a hybridization probe
labeled with two different fluorescent dyes. One
dye is a reporter dye (FAM), the other is a
quenching dye (TAMRA). When the probe is intact,
fluorescent energy transfer occurs and the reporter
dye fluorescent emission is absorbed by the
quenching dye (TAMRA). During the extension
phase of the PCR cycle, the fluorescent
hybridization probe is cleaved by the 5'-3' nucleolytic
activity of the DNA polymerase. On cleavage of
the probe, the reporter dye emission is no longer
transferred efficiently to the quenching dye,
resulting in an increase of the reporter dye
fluorescent emission spectra. PCR primers and probes
were designed for the human factor VIII
sequence and human β-actin gene (as described in
Methods). Optimization reactions were
performed to choose the appropriate probe and
magnesium concentrations yielding the highest
intensity of reporter fluorescent signal without
sacrificing specificity. The instrument uses a
charge-coupled device (i.e., CCD camera) for
measuring the fluorescent emission spectra from
500 to 650 nm. Each PCR tube was monitored
sequentially for 25 msec with continuous
monitoring throughout the amplification. Each tube
was re-examined every 8.5 sec. Computer
software was designed to examine the fluorescent
intensity of both the reporter dye (FAM) and
the quenching dye (TAMRA). The fluorescent
intensity of the quenching dye, TAMRA, changes
very little over the course of the PCR
amplification (data not shown). Therefore, the intensity
of TAMllA dye emission serves as >m Internal 
tlaudard with which to norm alb*! the reporter 
dye (PAM) emission variations. The software" cal- 
culate? u value termed AKn (or AftQ) uaing the. 
following equation: Akn - (Iln J ) (R»')« where 
Kn 4 .. cmlmiluji iiUcjisity \>t reporter/emission in- 
tensity of quencher al any given time In a reae 
tion tube, and Rn r» emission intensitity of re- 
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poncr/cmlsslon *mcxu>ily "f quencher measured 
prior to TCK imiplilicatioii in ihat same reaction 
tube. Tor the purpose of quantitation, the \*s\ 
three data points (ARns) collected during the. ex- 
tension step for each K:k cycle w«rc analyzed- 
The micleolylic degradation of the. hyuildiy-ttion. 
probe occurs during the extension phase or it at, 
and, therefore, reporter fluorescent ciiimmuo In- 
creases during this time. Hie Uncc daw point* 
were averaged for eacJi k;K cycle and the mean 
value for each was plotted in an "amplification 
plot" shown In J'i«ure J A. The AKn mean value is 
plotted on the }*axJs, and time, represented by 
cycle number, is plot (fid on ttie*-axis. During this 
ear jy cycles of the VCR amplification, the ARn 



value remain?; at base line When sufficient hy- 
bridist Ion probe IwiS been cleaved by the Tmj 
ixjlymerasc nuflttMG Activity, the intensity of re- 
porter fluorc-accm emission lufmaavfr* Mftfil M*-tt 
amplin^tions reach » plateau phone of reporter 
fluojCKXitil cmifislon if the reauliun Is carried out 
lo high cycle iiumUeis. The a/npli Real Ion plot \'J 
examined euily in Hut reaction, ut a point IhAl 
icjjicsents ihv log phAW of product arnnnula- 
tion. This Js done by aligning an mbibaiy 
ihfeshoki thai is bu*cd on the variability of the 
tasm-Uiiv dMU. In Figure 1 A, the threshold was set 
ai 10 standard deviations above, the mean of 
Wafto line emission calculated from eyrie* 1 lo 1 S. 
Once the threshold is chosen, the point at which 
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Figure 1 PCR product detection in real time. {A) The Model 7700 vutlware will comimct ^P^j? 0 P 1 ^ 
from the extension phase fluorescent emission data collected during the PCR ■mpjfoun. The *n*£ de- 
viation is determined Irom the data points collected from the base line of the ******* ^ ^0 timS the 
calculated by determining the poinl ai which the fluorescence exceeds a thresh old I mil 
Standard deviation of the base line). (8) Overlay of amplification plots of serially (1 ^^J^SSc^S 
DNA samples amplified with p-actin primers. (Q Input DNA concentration of the samples Pj^™^ All 
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the amplification plot crosseo the threshold" is 
fined as C r . C r is reported us the cycle number ;« 
this point. Ar will be demon st rut ltd, the CI, .value 
Is pieUiuJve of the quantity of input target. 
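
The ΔRn and CT definitions above can be illustrated with a minimal Python sketch. It is not the Model 7700 software: the function names, the default of 15 baseline cycles with a 10-standard-deviation threshold (taken from the Figure 1A description), and the linear interpolation between the two cycles that bracket the threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev


def delta_rn(reporter, quencher, reporter_pre, quencher_pre):
    """ΔRn = (Rn+) - (Rn-): reporter/quencher ratio minus its pre-PCR value."""
    return reporter / quencher - reporter_pre / quencher_pre


def threshold_cycle(delta_rn_by_cycle, baseline_cycles=15, n_sd=10.0):
    """Return the cycle at which ΔRn first rises above the threshold.

    delta_rn_by_cycle[0] holds the mean ΔRn of cycle 1. The threshold is the
    base-line mean plus n_sd standard deviations, computed over the first
    baseline_cycles cycles. The fractional cycle is found by linear
    interpolation (an assumption of this sketch). Returns None if the
    threshold is never crossed.
    """
    baseline = delta_rn_by_cycle[:baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(baseline_cycles, len(delta_rn_by_cycle)):
        prev, cur = delta_rn_by_cycle[i - 1], delta_rn_by_cycle[i]
        if prev <= threshold < cur:
            # prev belongs to cycle i, cur to cycle i + 1
            return i + (threshold - prev) / (cur - prev)
    return None


# Hypothetical amplification data: a noisy base line followed by a rise.
curve = [0.0, 0.02] * 7 + [0.0] + [0.05, 0.1, 0.2, 0.4]
ct = threshold_cycle(curve)
```

With these made-up values the threshold is crossed between cycles 17 and 18, so the sketch reports a fractional CT in that interval.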

CT Values Provide a Quantitative Measurement of
Input Target Sequences

Figure 1B shows amplification plots of different PCR amplifications
overlaid. The amplifications were performed on a 1:2 serial dilution of
human genomic DNA. The amplified target was human β-actin. The
amplification plots shift to the right (to higher threshold cycles) as
the input target quantity is reduced. This is expected because
reactions with fewer starting copies of the target molecule require
greater amplification to degrade enough probe to attain the threshold
fluorescence. An arbitrary threshold of 10 standard deviations above
the base line was used to determine the CT values. Figure 1C represents
the CT values plotted versus the sample dilution value. Each dilution
was amplified in triplicate PCR amplifications and plotted as mean
values with error bars representing one standard deviation. The CT
values decrease linearly with increasing target quantity. Thus, CT
values can be used as a quantitative measurement of the input target
number. It should be noted that the amplification plot for the 15.6-ng
sample shown in Figure 1B does not reflect the same fluorescent rate of
increase exhibited by most of the other samples. The 15.6-ng sample
also achieves endpoint plateau at a lower fluorescent value than would
be expected based on the input DNA. This phenomenon has been observed
occasionally with other samples (data not shown) and may be
attributable to late cycle inhibition; this hypothesis is still under
investigation. It is important to note that the flattened slope and
early plateau do not impact significantly the calculated CT value, as
demonstrated by the fit on the line shown in Figure 1C. All triplicate
amplifications resulted in very similar CT values; the standard
deviation did not exceed 0.5 for any dilution. This experiment contains
a >100,000-fold range of input target molecules. Using CT values for
quantitation permits a much larger assay range than directly using
total fluorescent emission intensity for quantitation. The linear range
of fluorescent intensity measurement of the ABI Prism 7700 Sequence
... measurements over a very large range of relative starting
target quantities.
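
The linear relationship between CT and the logarithm of input target can be sketched as an ordinary least-squares standard curve. The function names and the dilution data below are hypothetical and are not taken from Figure 1C; the data assume an idealized reaction in which each twofold dilution costs exactly one cycle.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of CT = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the input quantity for a CT."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 1:2 dilution series (ng of input DNA) and idealized CT values.
slope, intercept = fit_standard_curve([100, 50, 25, 12.5],
                                      [18.0, 19.0, 20.0, 21.0])
unknown_ng = quantity_from_ct(19.5, slope, intercept)
```

For perfect doubling per cycle the slope is -1/log10(2), about -3.32 cycles per tenfold change in input, which gives a reference point for judging measured slopes.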

Sample Preparation Validation 

Several parameters influence the efficiency of PCR amplification:
magnesium and salt concentrations, reaction conditions (i.e., time and
temperature), PCR target size and composition, primer sequences, and
sample purity. All of the above factors are common to a single PCR
assay, except sample to sample purity. In an effort to validate the
method of sample preparation for the factor VIII assay, PCR
amplification reproducibility and efficiency of 10 replicate sample
preparations were examined. After genomic DNA was prepared from the 10
replicate samples, the DNA was quantified by ultraviolet spectroscopy.
Amplifications were performed analyzing β-actin gene content in 100 and
25 ng of total genomic DNA. Each PCR amplification was performed in
triplicate. Comparison of CT values for each triplicate sample shows
minimal variation based on standard deviation and coefficient of
variance (Table 1). Therefore, each of the triplicate PCR
amplifications was highly reproducible, demonstrating that real time
PCR using this instrumentation introduces minimal variation into the
quantitative PCR analysis. Comparison of the mean CT values of the 10
replicate sample preparations also showed minimal variability,
indicating that each sample preparation yielded similar results for
β-actin gene quantity. The highest difference between any of the
samples was 0.85 and 0.73 for the 100 and 25 ng samples, respectively.
Additionally, the amplification of each sample exhibited an equivalent
rate of fluorescent emission intensity change per amount of DNA target
analyzed, as indicated by similar slopes derived from the sample
dilutions (Fig. 2). Any sample containing an excess of a PCR inhibitor
would exhibit a greater measured β-actin CT value for a given quantity
of DNA. In addition, the inhibitor would be diluted along with the
sample in the dilution analysis (Fig. 2), altering the expected CT
value change. Each sample amplification yielded a similar result in the
analysis, demonstrating that this method of sample preparation is
highly reproducible with regard to sample purity.
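
The replicate statistics reported in Table 1 (mean, standard deviation, and coefficient of variance) can be computed as in the following sketch. It assumes the sample standard deviation and a coefficient of variance expressed in percent, which the text does not state explicitly; the triplicate values are hypothetical.

```python
from statistics import mean, stdev


def replicate_stats(ct_values):
    """Return (mean, sample SD, CV in percent) for replicate CT values."""
    m = mean(ct_values)
    sd = stdev(ct_values)
    return m, sd, 100.0 * sd / m


# Hypothetical triplicate CT values for one sample preparation.
ct_mean, ct_sd, ct_cv = replicate_stats([18.0, 18.1, 18.2])
```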

Quantitative Analysis of a Plasmid After 
Table 1. Reproducibility of Sample Preparation Method

                        100 ng                                      25 ng
Sample                               standard                                    standard
no.      CT                   mean   deviation  CV     CT                   mean   deviation  CV

1        18.24  18.23  18.33  18.27  0.06       0.32   20.48  20.55  20.5   20.51  0.03       0.17
2        18.33  18.35  18.44  18.37  0.06       0.32   20.61  20.59  20.41  20.54  0.11       0.54
3        18.3   18.3   18.42  18.34  0.07       0.36   20.54  20.6   20.49  20.54  0.06       0.26
4        18.15  18.23  18.32  18.23  0.08       0.46   20.48  20.44  20.38  20.43  0.05       0.26
5        18.4   18.38  18.46  18.42  0.04       0.23   20.68  20.87  20.63  20.73  0.13       0.61
6        18.54  18.67  19     18.74  0.24       1.26   21.09  21.04  21.04  21.06  0.03       0.15
7        18.28  18.36  18.52  18.39  0.12       0.66   20.67  20.73  20.65  20.68  0.04       0.2
8        18.45  18.7   18.73  18.63  0.16       0.83   20.96  20.84  20.75  20.86  0.12       0.57
9        18.18  18.34  18.26  18.29  0.1        0.54   20.46  20.54  20.48  20.51  0.07       0.32
10       18.42  18.57  18.66  18.55  0.12       0.65   20.79  20.78  20.62  20.73  0.1        0.16
Mean                          18.42  0.17       0.90                        20.66  0.19       0.94



vector containing a partial cDNA for human factor VIII, pF8TM. A series
of transfections was set up using a decreasing amount of the plasmid
(40, 4, 0.5, and 0.1 µg). Twenty-four hours post-transfection, total
DNA was purified from each flask of cells. β-Actin gene quantity was
chosen as a value for normalization of genomic DNA concentration from
each sample. In this experiment, β-actin gene content should remain
constant relative to total genomic DNA. Figure 3 shows the result of
the β-actin DNA measurement (100 ng total DNA determined by ultraviolet
spectroscopy) of each sample. Each sample was analyzed in triplicate
and the mean β-actin CT values of the triplicates were plotted (error
bars represent one standard deviation). The highest difference between
any two sample means was 0.95 CT. Ten nanograms of total DNA of each
sample were also examined for β-actin. The results again showed that
very similar amounts of genomic DNA were present; the maximum mean
β-actin CT value difference was 1.0. As Figure 3 shows, the rate of
β-actin CT change between the 100 and 10-ng samples was similar (slope
values range between -3.56 and -3.45). This verifies again that this
method of sample preparation yields samples of identical PCR integrity
(i.e., no sample contained an excessive amount of a PCR inhibitor).
However, these results indicate that each sample contained slight
differences in the actual amount of genomic DNA analyzed. Determination
of actual genomic DNA concentration was accomplished by plotting the
mean β-actin CT value obtained for each 100-ng sample on a β-actin
standard curve (shown in Fig. 4C). The actual genomic DNA concentration
of each sample, a, was obtained by extrapolation to the x-axis.

[Figure 2 plot: x-axis, log (ng input genomic DNA)]

Figure 2 Sample preparation purity. The replicate samples shown in
Table 1 were also amplified in triplicate using 25 ng of each DNA
sample. The figure shows the input DNA concentration (100 and 25 ng)
vs. CT. In the figure, the 100 and 25 ng points for each sample are
connected by a line.

[Figure 3 plot: x-axis, log (ng input DNA); legend, amount of pF8TM
transfected]

Figure 3 Analysis of transfected cell DNA quantity and purity. The DNA
preparations of the four 293 cell transfections (40, 4, 0.5, and 0.1 µg
of pF8TM) were analyzed for the β-actin gene. 100 and 10 ng (determined
by ultraviolet spectroscopy) of each sample were amplified in
triplicate. For each amount of pF8TM that was transfected, the β-actin
CT values are plotted versus the total input DNA

Figure 4A shows the measured (i.e., non-normalized) quantities of
factor VIII plasmid DNA (pF8TM) from each of the four transient cell
transfections. Each reaction contained 100 ng of total sample DNA (as
determined by UV spectroscopy). Each sample was analyzed in triplicate
PCR amplifications. As shown, pF8TM purified from the 293 cells
decreases (mean CT values increase) with decreasing amounts of plasmid
transfected. The mean CT values obtained for pF8TM in Figure 4A were
plotted on a standard curve comprised of serially diluted pF8TM, shown
in Figure 4B. The quantity of pF8TM, b, found in each of the four
transfections was determined by extrapolation to the x-axis of the
standard curve in Figure 4B. These uncorrected values, b, for pF8TM
were normalized to determine the actual amount of pF8TM found per
100 ng of genomic DNA by using the equation:

    (b × 100 ng) / a = actual pF8TM copies per 100 ng of genomic DNA

where a = actual genomic DNA in a sample and b = pF8TM copies from the
standard curve. The normalized quantity of pF8TM per 100 ng of genomic
DNA for each of the four transfections is shown in Figure 4D. These
results show that the quantity of factor VIII plasmid associated with
the 293 cells, 24 hr after transfection, decreases with decreasing
plasmid concentration used in the transfection. The quantity of pF8TM
associated with 293 cells, after transfection with 40 µg of plasmid,
was 35 pg per 100 ng genomic DNA. This results in approximately 520
plasmid copies per cell.
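
The normalization step above amounts to a single rescaling, sketched here in Python. The function name and the example numbers are hypothetical; a and b stand for the two standard-curve readings defined in the text.

```python
def normalize_per_100ng(b, a_ng):
    """Scale the measured plasmid quantity b to 100 ng of genomic DNA.

    b    -- pF8TM quantity read from the pF8TM standard curve
    a_ng -- actual genomic DNA (ng) read from the β-actin standard curve
    """
    if a_ng <= 0:
        raise ValueError("actual genomic DNA amount must be positive")
    return b * 100.0 / a_ng


# Hypothetical sample: 30 units of plasmid measured in a reaction that
# actually contained 120 ng of genomic DNA rather than the nominal 100 ng.
normalized = normalize_per_100ng(30.0, 120.0)
```

A reaction that held more genomic DNA than the nominal 100 ng therefore has its plasmid quantity scaled down, and one that held less has it scaled up.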



DISCUSSION 

We have described a new method for quantitating gene copy numbers using
real-time analysis of PCR amplifications. Real-time PCR is compatible
with either of the two PCR (RT-PCR) approaches: (1) quantitative
competitive, where an internal competitor for each target sequence is
used for normalization (data not shown), or (2) quantitative
comparative PCR using a normalization gene contained within the sample
(i.e., β-actin) or a "housekeeping" gene for RT-PCR. If equal amounts
of nucleic acid are analyzed for each sample and if the amplification
efficiency before quantitative analysis is identical for each sample,
the internal control (normalization gene or competitor) should give
equal signals for all samples.

The real-time PCR method offers several advantages over the other two
methods currently employed (see the Introduction). First, the real-time
PCR method is performed in a closed-tube system and requires no
post-PCR manipulation of sample. Therefore, the potential for PCR
contamination in the laboratory is reduced because amplified products
can be analyzed and disposed of without opening the reaction tubes.
Second, this method supports the use of a normalization gene (i.e.,
β-actin) for quantitative PCR or housekeeping genes for quantitative
RT-PCR controls. Analysis is performed in real time during the log
phase of product accumulation. Analysis during log phase permits many
different genes (over a wide input target range) to be analyzed
simultaneously, without concern of reaching reaction plateau at
different cycles. This will make multigene analysis assays much easier
to develop, because individual internal competitors will not be needed
for each gene under analysis. Third, sample throughput will increase
dramatically with the new method because there is no post-PCR
processing time. Additionally, working in a 96-well format is highly
compatible with automation technology.

Figure 4 Quantitative analysis of pF8TM in transfected cells.
(A) Amount of plasmid DNA used for the transfection plotted against the
mean CT value determined for pF8TM remaining 24 hr after transfection.
(B,C) Standard curves of pF8TM and β-actin, respectively. pF8TM DNA (B)
and genomic DNA (C) were diluted serially 1:5 before amplification with
the appropriate primers. The β-actin standard curve was used to
normalize the results of A to 100 ng of genomic DNA. (D) The amount of
pF8TM present per 100 ng of genomic DNA.

The real-time PCR method is highly reproducible. Replicate
amplifications can be analyzed for each sample, minimizing potential
error. The system allows for a very large assay dynamic range
(approaching 1,000,000-fold starting target). Using a standard curve
for the target of interest, relative copy number values can be
determined for any unknown sample. Fluorescent threshold values, CT,
correlate linearly with relative DNA copy numbers. Real time
quantitative RT-PCR methodology (Gibson et al., this issue) has also
been developed. Finally, real time quantitative PCR methodology can be
used to develop high-throughput screening assays for a variety of
applications [quantitative gene expression (RT-PCR), gene copy assays
(Her2, HIV, etc.), genotyping (knockout mouse analysis), and
immuno-PCR].

Real-time PCR may also be performed using intercalating dyes (Higuchi
et al.) such as ethidium bromide. The fluorogenic probe method offers a
major advantage over intercalating dyes: greater specificity (i.e.,
primer dimers and nonspecific PCR products are not detected).
METHODS

Generation of a Plasmid Containing a Partial
cDNA for Human Factor VIII

Total RNA was harvested (RNAzol B from Tel-Test, Inc., Friendswood, TX)
from cells transfected with a factor VIII expression vector,
pCIS2.8c25D (Eaton et al. 1986; Gorman et al. 1990). A factor VIII
partial cDNA sequence was generated by RT-PCR [GeneAmp EZ rTth RNA PCR
Kit (part N808-0179, PE Applied Biosystems, Foster City, CA)] using the
PCR primers F8for and F8rev (primer sequences are shown below). The
amplicon was reamplified using modified F8for and F8rev primers
(appended with BamHI and HindIII restriction site sequences at the
5' end) and cloned into pGEM-3Z (Promega Corp., Madison, WI). The
resulting clone, pF8TM, was used for transient transfection of 293
cells.

Amplification of Target DNA and Detection of Amplicon

Factor VIII plasmid DNA (pF8TM) was amplified with the primers F8for
5'-CCCGTGCCAAGAGTGACGTGTC-3' and F8rev 5'-AAACCTCAGCCTGGATGGTCAGG-3'.
The reaction produced a 422-bp product. The forward primer was designed
to recognize a unique sequence found in the 5' untranslated region of
the parent pCIS2.8c25D plasmid and therefore does not recognize and
amplify the human factor VIII gene. Primers were chosen with the
assistance of the computer program Oligo 4.0 (National Biosciences,
Inc., Plymouth, MN). The human β-actin gene was amplified with the
primers β-actin forward primer 5'-TCACCCACACTGTGCCCATCTACGA-3' and
β-actin reverse primer 5'-CAGCGGAACCGCTCATTGCCAATGG-3'. The reaction
produced a 295-bp PCR product.

Amplification reactions (50 µl) contained a DNA sample, 10× PCR
Buffer II (5 µl), 200 µM dATP, dCTP, dGTP, and 400 µM dUTP, 4 mM MgCl2,
1.25 Units AmpliTaq DNA polymerase, 0.5 unit AmpErase uracil
N-glycosylase (UNG), 50 pmole of each factor VIII primer, and 15 pmole
of each β-actin primer. The reactions also contained one of the
following detection probes (100 nM each): F8probe
5'-(FAM)AGCTCTCCACCTGCTTCTTTCTGTGCCTT(TAMRA)p-3' and β-actin probe
5'-(FAM)ATGCCC-X(TAMRA)-CCCCCATGCCATCp-3', where p indicates
phosphorylation and X indicates a linker arm nucleotide. Reaction tubes
were MicroAmp Optical Tubes (part number N801 0933, Perkin Elmer) that
were frosted (at Perkin Elmer) to prevent light from reflecting. Tube
caps were similar to MicroAmp Caps but specially designed to prevent
light scattering. All of the PCR consumables were supplied by PE
Applied Biosystems (Foster City, CA) except the factor VIII primers,
which were synthesized at Genentech, Inc. (South San Francisco, CA).
Probes were designed using the Oligo 4.0 software, following guidelines
suggested in the Model 7700 Sequence Detector instrument manual.
Briefly, probe Tm should be at least 5°C higher than the annealing
temperature used during thermal cycling; primers should not form stable
duplexes with the probe.

The thermal cycling conditions included 2 min at 50°C and 10 min at
95°C. Thermal cycling proceeded with



reactions were performed in the Model 7700 Sequence Detector (PE
Applied Biosystems), which contains a GeneAmp PCR System 9600. Reaction
conditions were programmed on a Power Macintosh 7100 (Apple Computer,
Santa Clara, CA) linked directly to the Model 7700 Sequence Detector.
Analysis of data was also performed on the Macintosh computer.
Collection and analysis software was developed at PE Applied
Biosystems.

Transfection of Cells with Factor VIII Construct

Four T175 flasks of 293 cells (ATCC CRL 1573), a human fetal kidney
suspension cell line, were grown to 80% confluency and transfected with
pF8TM. Cells were grown in the following media: 50% HAM'S F12 without
GHT, 50% low glucose Dulbecco's modified Eagle medium (DMEM) without
glycine with sodium bicarbonate, 10% fetal bovine serum, 2 mM
L-glutamine, and 1% penicillin-streptomycin. The media was changed
30 min before the transfection. pF8TM DNA amounts of 40, 4, 0.5, and
0.1 µg were added to 1.5 ml of a solution containing 0.125 M CaCl2 and
1× HEPES. The four mixtures were left at room temperature for 10 min
and then added dropwise to the cells. The flasks were incubated at 37°C
and 5% CO2 for 24 hr, washed with PBS, and resuspended in PBS. The
resuspended cells were divided into aliquots and DNA was extracted
immediately using the QIAamp Kit (Qiagen, Chatsworth, CA). DNA was
eluted into 200 µl of 30 mM Tris-HCl at pH 8.0.
ABSTRACT Wnt family members are critical to many
developmental processes, and components of the Wnt signal- 
ing pathway have been linked to tumorigenesis in familial and 
sporadic colon carcinomas. Here we report the identification 
of two genes, WISP-1 and WISP-2, that are up-regulated in the 
mouse mammary epithelial cell line C57MG transformed by 
Wnt-1, but not by Wnt-4. Together with a third related gene, 
WISP-3, these proteins define a subfamily of the connective 
tissue growth factor family. Two distinct systems demon- 
strated WISP induction to be associated with the expression of 
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3 
mapped to chromosome 6q22-6q23 and also was overex- 
pressed (4- to >40-fold) in 63% of the colon tumors analyzed. 
In contrast, WISP-2 mapped to human chromosome 20q12-20q13
and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1 
signaling and that aberrant levels of WISP expression in colon 
cancer may play a role in colon tumorigenesis. 



Wnt-1 is a member of an expanding family of cysteine-rich, 
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1, 
2). Wnt-1 originally was identified as an oncogene activated by 
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5). 

In mammalian cells, Wnt family members initiate signaling 
by binding to the seven-transmembrane spanning Frizzled 
receptors and recruiting the cytoplasmic protein Dishevelled 
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the 
kinase activity of the normally constitutively active glycogen 
synthase kinase-3β (GSK-3β) resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the
transcription factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that 
the adenomatous polyposis coli (APC) tumor suppressor gene 
also plays an important role in Wnt signaling by regulating 
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations con- 
tribute to the development of these types of cancer, implicating 
the Wnt pathway in tumorigenesis (1). 

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the tran- 
scriptionally activated downstream components activated by 
Wnt have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to 
Wnt signaling. Among the candidate Wnt target genes are 
those encoding the nodal-related 3 gene, Xnr3, a member of 
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Over- 
expression of Wnt-1 in this cell line is sufficient to induce a 
partially transformed phenotype, characterized by elongated 
and refractile cells that lose contact inhibition and form a 
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1 
and WISP-2, and a third related gene, WISP-3. The WISP genes 
are members of the CCN family of growth factors, which 
includes connective tissue growth factor (CTGF), Cyr61, and 
nov, a family not previously linked to Wnt signaling. 

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 µg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 µg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the Genbank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
†To whom reprint requests should be addressed. e-mail: diane@gene.com.

14718 Cell Biology, Medical Sciences: Pennica et al.

cDNA Library Screening. Clones encoding full-length 
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones en- 
coding full-length mouse and human WISP-2 were isolated by 
screening a C57MG/Wnt-1 or human fetal lung cDNA library 
with a probe corresponding to nucleotides 1463-1512. Full- 
length cDNAs encoding WISP-3 were cloned from human 
bone marrow and fetal kidney libraries. 

Expression of Human WISP RNA. PCR amplification of
first-strand cDNA was performed with human Multiple Tissue
cDNA panels (CLONTECH) and 300 µM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles. 
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. 33P-labeled sense and antisense riboprobes
were transcribed from an 897-bp PCR product corresponding
to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of
mouse WISP-2. All tissues were processed as described (40).

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst 
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative 
gene amplification and RNA expression of WISPs and c-myc in 
the cell lines, colorectal tumors, and normal mucosa were 
determined by quantitative PCR. Gene-specific primers and 
fluorogenic probes (sequences available on request) were 
designed and used to amplify and quantitate the genes. The 
relative gene copy number was derived by using the formula 
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate dehy- 
drogenase housekeeping gene. All TaqMan assay reagents 
were obtained from Perkin-Elmer Applied Biosystems. 
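
The 2^(ΔCt) calculation described above can be sketched as follows. The way the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) normalization is combined with the reference-versus-test comparison is an assumption of this sketch (the text states only that the WISP signal was normalized to GAPDH), as are the function name and the example CT values. The formula itself implies a doubling of product with every cycle.

```python
def relative_quantity(ct_target_ref, ct_target_test,
                      ct_gapdh_ref, ct_gapdh_test):
    """Relative copy number or expression of a target, test vs. reference.

    Each target CT is first normalized to the GAPDH CT of the same sample;
    the difference between reference and test is then converted to a fold
    change, assuming the product doubles every cycle.
    """
    delta_ct_ref = ct_target_ref - ct_gapdh_ref
    delta_ct_test = ct_target_test - ct_gapdh_test
    return 2.0 ** (delta_ct_ref - delta_ct_test)


# Hypothetical CT values: the target is detected 2 cycles earlier in the
# test sample (e.g., tumor) than in the reference, with equal GAPDH CT.
fold_up = relative_quantity(25.0, 23.0, 18.0, 18.0)
# The opposite case: detection 2 cycles later in the test sample.
fold_down = relative_quantity(22.0, 24.0, 18.0, 18.0)
```

Values above 1 indicate amplification or overexpression in the test sample, and values below 1 indicate reduction.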

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify Wnt- 
1-inducible genes, we used the technique of SSH using the 
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially ex- 
pressed cDNAs (1,384 total) were sequenced. Thirty-nine 
percent of the sequences matched known genes or homo- 
logues, 32% matched expressed sequence tags, and 29% had 
no match. To confirm that the transcript was differentially 
expressed, semiquantitative reverse transcription-PCR and 
Northern analysis were performed by using mRNA from the 
C57MG and C57MG/Wnt-1 cells.

Two of the cDNAs, WISP-1 and WISP-2, were differentially 
expressed, being induced in the C57MG/Wnt-1 cell line, but 
not in the parent C57MG cells or C57MG cells overexpressing 
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce 
the morphological transformation of C57MG cells and has no
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the 
sequence compared with mouse WISP-1. The cDNA sequences 
of mouse and human WISP-1 were 1,766 and 2,830 bp in length, 
respectively, and encode proteins of 367 aa, with predicted 
relative molecular masses of ~40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cys- 
teine residues, and four potential N-linked glycosylation sites 
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were 
1,734 and 1,293 bp in length, respectively, and encode proteins 
of 251 and 250 aa, respectively, with predicted relative molec-
ular masses of ~27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential 
N-linked glycosylation sites, and mouse WISP-2 has one at 
Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and
C57MG/Wnt-4 cells. Poly(A)+ RNA (2 μg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential 
signal sequence, insulin-like growth factor-binding protein (IGF-BP), 
VWC, thrombospondin (TSP), and C-terminal (CT) domains are 
underlined. 

position 197. WISP-2 has 28 cysteine residues that are con- 
served among the 38 cysteines found in WISP-1.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosyl-
ation sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity 
with WISP-1 and 32% identity with WISP-3 (Fig. 3A). 
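
Percent identity figures such as those above depend on how the alignment is scored. The sketch below shows one common convention and is not necessarily the one used here: identical residues are counted over all alignment columns, and columns that are gaps in both rows are ignored. The function name and the gap character are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Both strings must already be aligned (equal length, gaps as `gap`).
    The denominator is the number of alignment columns, an assumed convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both rows hold a gap; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    # A column counts as identical only if both rows hold the same residue.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so values from different tools are not directly comparable.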

WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-
44%) was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protoonco-
gene nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracel-
lular matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18, 
19). nov (nephroblastoma overexpressed) is an immediate 
early gene associated with quiescence and found altered in 
Wilms tumors (20). The proteins of the CCN family share 
functional, but not sequence, similarity to Wnt-1. All are 
secreted, cysteine-rich heparin binding glycoproteins that as- 
sociate with the cell surface and extracellular matrix. 

WISP proteins exhibit the modular architecture of the CCN 
family, characterized by four conserved cysteine-rich domains 
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGC- 
CXXC) conserved in most insulin-like growth factor (IGF)- 
Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not 
present in WISP-3 are indicated with a dot. (B) Schematic represen- 
tation of the WISP proteins showing the domain structure and cysteine 
residues (vertical lines). The four cysteine residues in the VWC domain 
that are absent in WISP-3 are indicated with a dot. (C) Expression of 
WISP mRNA in human tissues. PCR was performed on human 
multiple-tissue cDNA panels (CLONTECH) from the indicated adult 
and fetal tissues. 

binding proteins (BP). This sequence is conserved in WISP-2 
and WISP-3, whereas WISP-1 has a glutamine in the third 
position instead of a glycine. CTGF recently has been shown 
to specifically bind IGF (22) and a truncated nov protein 
lacking the IGF-BP domain is oncogenic (23). The von Wil- 
lebrand factor type C module (VWC), also found in certain 
collagens and mucins, covers the next 10 cysteine residues, and 
is thought to participate in protein complex formation and 
oligomerization (24). The VWC domain of WISP-3 differs 
from all CCN family members described previously, in that it 
contains only six of the 10 cysteine residues (Fig. 3 A and B). 
A short variable region follows the VWC domain. The third 
module, the thrombospondin (TSP) domain is involved in 
binding to sulfated glycoconjugates and contains six cysteine 
residues and a conserved WSxCSxxCG motif first identified in 
thrombospondin (25). The C-terminal (CT) module contain- 
ing the remaining 10 cysteines is thought to be involved in 
dimerization and receptor binding (26). The CT domain is 
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative 
signal sequence and the absence of a transmembrane domain 
suggest that WISPs are secreted proteins, an observation 
supported by an analysis of their expression and secretion from 
mammalian cell and baculovirus cultures (data not shown).
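
The two consensus motifs named above (GCGCCXXC in the IGF-BP domain and WSxCSxxCG in the TSP domain) can be located in a protein sequence with a simple pattern search. The following is an illustrative sketch only: the function and dictionary names are hypothetical, and the sequences in the usage note are made-up fragments, not WISP sequences.

```python
import re

# "X"/"x" in the published motifs means any residue, written "." in a regex.
MOTIFS = {
    "IGF-BP": re.compile(r"GCGCC..C"),
    "TSP": re.compile(r"WS.CS..CG"),
}

def find_motifs(protein):
    """Return 1-based start positions of each motif in a protein sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # finditer reports non-overlapping matches, left to right.
        hits[name] = [m.start() + 1 for m in pattern.finditer(protein)]
    return hits
```

For the invented fragment "MAGCGCCKVCAAWSPCSTTCG" the function reports the IGF-BP motif at position 3 and the TSP motif at position 13. A strict pattern of this kind would not match a variant such as GCQCC..C, which is the form the text describes for WISP-1.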

Expression of WISP mRNA in Human Tissues. Tissue- 
specific expression of human WISPs was characterized by PCR 
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C). 
Little or no expression was detected in the brain, liver, skeletal 
muscle, colon, peripheral blood leukocytes, prostate, testis, or 
thymus. WISP-2 had a more restricted tissue expression and 
was detected in adult skeletal muscle, colon, ovary, and fetal 
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and 
small intestine. 

In Situ Localization of WISP-1 and WISP-2. Expression of 
WISP-1 and WISP-2 was assessed by in situ hybridization in 
mammary tumors from Wnt-1 transgenic mice. Strong expres- 
sion of WISP-1 was observed in stromal fibroblasts lying within 
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-
level WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal 
breast. Like WISP-1, WISP-2 expression also was seen in the 
tumor stroma in breast tumors from Wnt-1 transgenic animals 
(Fig. 4 E-H). However, WISP-2 expression in the stroma was 
in spindle-shaped cells adjacent to capillary vessels, whereas 




Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The correspond-
ing dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal(s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells 
appeared to be adjacent to capillary vessels whereas tumor cells are 
negative (G and H). 
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately 
3.48 cR from the meiotic marker AFM259xc5 [logarithm of 
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the 
same region as the human locus of the novH family member 
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine 
mapping indicates that WISP-1 is located near D8S1712 STS. 
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to chro-
mosome 6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene 
MYB (27, 29). 

Amplification and Aberrant Expression of WISPs in Human 
Colon Tumors. Amplification of protooncogenes is seen in 
many human tumors and has etiological and prognostic sig- 
nificance. For example, in a variety of tumor types, c-myc 
amplification has been associated with malignant progression 
and poor prognosis (30). Because WISP-1 resides in the same 
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so, 
whether this amplification was independent of the c-myc locus. 
Genomic DNA from human colon cancer cell lines was 
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold) 
amplification, with the HT-29 and WiDr cell lines demonstrat-
ing an 8-fold increase. Significantly, the pattern of amplifica-
tion observed did not correlate with that observed for c-myc, 
indicating that the c-myc gene is not part of the amplicon that 
involves the WISP-1 locus. 

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and 
WISP-2 was significantly greater than one, approximately 
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold 
for WISP-2 in 92% of the tumors (P < 0.001 for each). The 
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was signifi-
cantly higher than that of WISP-1 (P < 0.001).

The levels of WISP transcripts in RNA isolated from 19 
adenocarcinomas and their matched normal mucosa were 




Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quanti-
tative PCR. (B) Southern blots containing genomic DNA (10 μg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25 
adenocarcinomas was assayed by quantitative PCR, by comparing 
DNA from primary human tumors with pooled DNA from 10 healthy 
donors. The data are means ± SEM from one experiment done in 
triplicate. The experiment was repeated at least three times. 

assessed by quantitative PCR (Fig. 7). The level of WISP-1 
RNA present in tumor tissue varied but was significantly 
increased (2- to >25-fold) in 84% (16/19) of the human colon 
tumors examined compared with normal adjacent mucosa. 
Four of 19 tumors showed greater than 10-fold overexpression. 
In contrast, in 79% (15/19) of the tumors examined, WISP-2 
RNA expression was significantly lower in the tumor than the 
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in 
63% (12/19) of the colon tumors compared with the normal 
Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient. 
Expression of WISP mRNA in 19 adenocarcinomas was assayed by 
quantitative PCR. The Dukes stage of the tumor is listed under the 
sample number. The data are means ± SEM from one experiment 
done in triplicate. The experiment was repeated at least twice. 



mucosa. The amount of overexpression of WISP-3 ranged from 
4- to >40-fold. 



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3, 
are members of the CCN family of growth factors, which 
includes CTGF, Cyr61, and nov, a family not previously linked 
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3. 
This domain is thought to be involved in receptor binding and 
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine 
knot motif exist as dimers (32). It is tempting to speculate that 
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2 
exists as a monomer. If the CT domain is also important for 
receptor binding, WISP-2 may bind its receptor through a 
different region of the molecule than the other CCN family 
members. No specific receptors have been identified for CTGF 
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed 
that mammary tumor cells or inflammatory cells at the tumor 
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was 
observed in the stromal cells that surrounded the tumor cells 
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling
could occur in which the stromal cells could supply WISP-1 and
WISP-2 to regulate tumor cell growth on the WISP extracel- 
lular matrix. Stromal cell-derived factors in the extracellular 
matrix have been postulated to play a role in tumor cell 
migration and proliferation (35). The localization of WISP-1 
and WISP-2 in the stromal cells of breast tumors supports this 
paracrine model. 

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression, whereas overexpression of 
WISP-3 RNA was seen in the absence of DNA amplification. 
In contrast, WISP-2 DNA was amplified in the colon tumors, 
but its mRNA expression was significantly reduced in the 
majority of tumors compared with the expression in normal 
colonic mucosa from the same patient. The gene for human 
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node-negative breast cancer and many colon cancers, suggest-
ing the existence of one or more oncogenes at this locus
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification
observed for WISP-2 may be caused by another gene in this
amplicon.

A recent manuscript on rCop-1, the rat orthologue of 
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to hu-
man carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1 
transforms cells and induces tumorigenesis is unknown, the 
identification of WISPs as genes that may be regulated down- 
stream of Wnt-1 in C57MG cells suggests they could be 
important mediators of Wnt-1 transformation. The amplifica- 
tion and altered expression patterns of the WISPs in human 
colon tumors may indicate an important role for these genes 
in tumor development.

We have developed a novel "real time" quantitative PCR method. The method measures PCR product
accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan Probe). This method provides very
accurate and reproducible quantitation of gene copies. Unlike other quantitative PCR methods, real-time PCR
does not require post-PCR sample handling, preventing potential PCR product carry-over contamination and
resulting in much faster and higher throughput assays. The real-time PCR method has a very large dynamic
range of starting target molecule determination (at least five orders of magnitude). Real-time quantitative
PCR is extremely accurate and less labor-intensive than current quantitative PCR methods.



Quantitative nucleic acid sequence analysis has
had an important role in many fields of biologi-
cal research. Measurement of gene expression
(RNA) has been used extensively in monitoring
biological responses to various stimuli (Tan et al.
1994; Huang et al. 1995a,b; Prud'homme et al.
1995). Quantitative gene analysis (DNA) has
been used to determine the genome quantity of a
particular gene, as in the case of the human HER2
gene, which is amplified in ~30% of breast tu-
mors (Slamon et al. 1987). Gene and genome
quantitation (DNA and RNA) also have been used
for analysis of human immunodeficiency virus
(HIV) burden demonstrating changes in the lev-
els of virus throughout the different phases of the
disease (Connor et al. 1993; Piatak et al. 1993b;
Furtado et al. 1995).

Many methods have been described for the
quantitative analysis of nucleic acid sequences
(both for RNA and DNA; Southern 1975; Sharp et
al. 1980; Thomas 1980). Recently, PCR has
proven to be a powerful tool for quantitative
nucleic acid analysis. PCR and reverse transcrip-
tase (RT)-PCR have permitted the analysis of
minimal starting quantities of nucleic acid (as
little as one cell equivalent). This has made pos-
sible many experiments that could not have been
performed with traditional methods. Although
PCR has provided a powerful tool, it is imperative
that it be used properly for quantitation (Raey-
maekers 1995). Many early reports of quantita-
tive PCR and RT-PCR described quantitation of
the PCR product but did not measure the initial
target sequence quantity. It is essential to design
proper controls for the quantitation of the initial
target sequences (Ferre 1992; Clementi et al.).

Researchers have developed several methods
of quantitative PCR and RT-PCR. One approach
measures PCR product quantity in the log phase
of the reaction before the plateau (Kellogg et al.
1990; Pang et al. 1990). This method requires
that each sample has equal input amounts of
nucleic acid and that each sample under analysis
amplifies with identical efficiency up to the point
of quantitative analysis. A gene sequence (con-
tained in all samples at relatively constant quan-
tities, such as β-actin) can be used for sample
amplification efficiency normalization. Using
conventional methods of PCR detection and
quantitation (gel electrophoresis or plate capture
hybridization), it is extremely laborious to assure
that all samples are analyzed during the log phase
of the reaction (for both the target gene and the
normalization gene). Another method, quantita-
tive competitive (QC)-PCR, has been developed
and is used widely for PCR quantitation. QC-PCR
relies on the inclusion of an internal control
competitor in each reaction (Becker-Andre 1991;
Piatak et al. 1993a,b). The efficiency of each re-
action is normalized to the internal competitor.
A known amount of internal competitor can be
added to each sample. To obtain relative quanti-
tation, the unknown target PCR product is com-
pared with the known competitor PCR product.
Success of a quantitative competitive PCR assay
relies on developing an internal control that am-
plifies with the same efficiency as the target mol-
ecule. The design of the competitor and the vali-
dation of amplification efficiencies require a
dedicated effort. However, because QC-PCR does
not require that PCR products be analyzed during
the log phase of the amplification, it is the easier
of the two methods to use.
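
The arithmetic behind QC-PCR can be sketched as follows. This is a simplified illustration rather than a published protocol: it assumes that target and competitor amplify with the same efficiency, so that the ratio of their product signals equals the ratio of their starting copy numbers. The function and argument names are hypothetical.

```python
def qc_pcr_target_copies(competitor_copies, target_signal, competitor_signal):
    """Estimate starting target copies from a single competitive PCR reaction.

    competitor_copies: known number of competitor molecules spiked in
    target_signal / competitor_signal: measured product amounts (any unit,
    as long as both use the same one)
    Assumes equal amplification efficiency for target and competitor.
    """
    if competitor_signal <= 0:
        raise ValueError("competitor signal must be positive")
    # Equal efficiency preserves the starting ratio in the final products.
    return competitor_copies * target_signal / competitor_signal
```

With 1,000 competitor copies and a target product twice as abundant as the competitor product, the estimate is 2,000 starting target copies. In practice a titration series of competitor amounts is usually run and the equivalence point is read from it.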
Several detection systems are used for quan-
titative PCR and RT-PCR analysis: (1) agarose
gels, (2) fluorescent labeling of PCR products and
detection with laser-induced fluorescence using
capillary electrophoresis (Fasco et al. 1995; Wil-
liams et al. 1996) or acrylamide gels, and (3) plate
capture and sandwich probe hybridization (Mul-
der et al. 1994). Although these methods proved
successful, each method requires post-PCR ma-
nipulations that add time to the analysis and
may lead to laboratory contamination. The
sample throughput of these methods is limited
(with the exception of the plate capture ap-
proach), and, therefore, these methods are not
well suited for uses demanding high sample
throughput (i.e., screening of large numbers of
biomolecules or analyzing samples for diagnos-
tics or clinical trials).

Here we report the development of a novel
assay for quantitative DNA analysis. The assay is
based on the use of the 5' nuclease assay first
described by Holland et al. (1991). The method
uses the 5' nuclease activity of Taq polymerase to
cleave a nonextendible hybridization probe dur-
ing the extension phase of PCR. The approach
uses dual-labeled fluorogenic hybridization
probes (Lee et al. 1993; Bassler et al. 1995; Livak
et al. 1995a,b). One fluorescent dye serves as a
reporter [FAM (i.e., 6-carboxyfluorescein)] and its
emission spectra is quenched by the second fluo-
rescent dye, TAMRA (i.e., 6-carboxy-tetramethyl-
rhodamine). The nuclease degradation of the hy-
bridization probe releases the quenching of the
FAM fluorescent emission, resulting in an in-
crease in peak fluorescent emission at 518 nm.
The use of a sequence detector (ABI Prism) allows
measurement of fluorescent spectra of all 96 wells
of the thermal cycler continuously during the
PCR amplification. Therefore, the reactions are
monitored in real time. The output data is de-
scribed and quantitative analysis of input target
DNA sequences is discussed below.

RESULTS

PCR Product Detection in Real Time

The goal was to develop a high-throughput, sen-
sitive, and accurate gene quantitation assay for
use in monitoring lipid-mediated therapeutic
gene delivery. A plasmid encoding human factor
VIII gene sequence, pF8TM (see Methods), was
used as a model therapeutic gene. The assay uses
fluorescent TaqMan methodology and an instru-
ment capable of measuring fluorescence in real
time (ABI Prism 7700 Sequence Detector). The
TaqMan reaction requires a hybridization probe
labeled with two different fluorescent dyes. One
dye is a reporter dye (FAM), the other is a quench-
ing dye (TAMRA). When the probe is intact, fluo-
rescent energy transfer occurs and the reporter
dye fluorescent emission is absorbed by the
quenching dye (TAMRA). During the extension
phase of the PCR cycle, the fluorescent hybrid-
ization probe is cleaved by the 5'-3' nucleolytic
activity of the DNA polymerase. On cleavage of
the probe, the reporter dye emission is no longer
transferred efficiently to the quenching dye, re-
sulting in an increase of the reporter dye fluores-
cent emission spectra. PCR primers and probes
were designed for the human factor VIII se-
quence and human β-actin gene (as described in
Methods). Optimization reactions were per-
formed to choose the appropriate probe and
magnesium concentrations yielding the highest
intensity of reporter fluorescent signal without
sacrificing specificity. The instrument uses a
charge-coupled device (i.e., CCD camera) for
measuring the fluorescent emission spectra from
500 to 650 nm. Each PCR tube was monitored
sequentially for 26 msec with continuous moni-
toring throughout the amplification. Each tube
was re-examined every 8.5 sec. Computer soft-
ware was designed to examine the fluorescent in-
tensity of both the reporter dye (FAM) and
the quenching dye (TAMRA). The fluorescent
intensity of the quenching dye, TAMRA, changes
very little over the course of the PCR amplifi-
cation (data not shown). Therefore, the intensity
of TAMRA dye emission serves as an internal
standard with which to normalize the reporter
dye (FAM) emission variations. The software cal-
culates a value termed ΔRn (or ΔRQ) using the
following equation: ΔRn = (Rn+) - (Rn-), where
Rn+ = emission intensity of reporter/emission in-
tensity of quencher at any given time in a reac-
tion tube, and Rn- = emission intensity of re-
porter/emission intensity of quencher measured
prior to PCR amplification in that same reaction
tube. For the purpose of quantitation, the last
three data points (ΔRns) collected during the ex-
tension step for each PCR cycle were analyzed.
The nucleolytic degradation of the hybridization
probe occurs during the extension phase of PCR,
and, therefore, reporter fluorescent emission in-
creases during this time. The three data points
were averaged for each PCR cycle and the mean
value for each was plotted in an "amplification
plot" shown in Figure 1A. The ΔRn mean value is
plotted on the y-axis, and time, represented by
cycle number, is plotted on the x-axis. During the
early cycles of the PCR amplification, the ΔRn
value remains at base line. When sufficient hy-
bridization probe has been cleaved by the Taq
polymerase nuclease activity, the intensity of re-
porter fluorescent emission increases. Most PCR
amplifications reach a plateau phase of reporter
fluorescent emission if the reaction is carried out
to high cycle numbers. The amplification plot is
examined early in the reaction, at a point that
represents the log phase of product accumula-
tion. This is done by assigning an arbitrary
threshold that is based on the variability of the
base-line data. In Figure 1A, the threshold was set
at 10 standard deviations above the mean of
base line emission calculated from cycles 1 to 15.
Once the threshold is chosen, the point at which
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Figure 1 PCR product detection in real time. {A) The Model 7700 Mjftware will construct amplificatipn plots 
from the extension phase fluorescent emission data collected during the PCR amplification. The standard de- 
viation is determined Irom the data points collected from the base line of the amplification plot c, values are 
calculated by determining the point at which the fluorescence exceeds a threshold llmil (usually 10 times the 
Standard deviation of the base line), (B) Overlay ot amplification plots of serially (1 :2) diluted human genomic 
DNA samples amplified with E-actin primers. (Q Input DNA concentration of the samples plotted versus Cj. Ail 
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the amplification plot crossed the threshold* vvcltf. 
fined as C r C, is rqxutecJ us the cycle number u\ 
tlilK point. Ar will be demonstrated, the C, .value 
Is piedictlve of ihe quantity of input target. 
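The ΔRn calculation and the threshold rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Model 7700 software: the function names, the linear interpolation between cycles, and the use of the sample standard deviation are all hypothetical choices that the text does not specify.

```python
from statistics import mean, stdev


def delta_rn(rn_plus, rn_minus):
    """ΔRn = Rn+ - Rn-, where both terms are reporter/quencher emission ratios."""
    return rn_plus - rn_minus


def threshold_cycle(delta_rn_by_cycle, baseline_cycles=(1, 15), n_sd=10.0):
    """Return the (fractional) cycle at which ΔRn first crosses the threshold.

    delta_rn_by_cycle[i] holds the mean ΔRn for cycle i + 1.
    The threshold is mean + n_sd * SD of the base-line cycles.
    Linear interpolation between cycles is an assumption of this sketch.
    Returns None if the trace never crosses the threshold.
    """
    first, last = baseline_cycles
    baseline = delta_rn_by_cycle[first - 1:last]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(last, len(delta_rn_by_cycle)):
        prev, cur = delta_rn_by_cycle[i - 1], delta_rn_by_cycle[i]
        if prev < threshold <= cur:
            # prev belongs to cycle i, cur to cycle i + 1
            return i + (threshold - prev) / (cur - prev)
    return None


# Hypothetical trace: 15 noisy base-line cycles, then exponential growth.
trace = [0.0, 0.02] * 7 + [0.0] + [0.05, 0.1, 0.2, 0.4, 0.8]
ct = threshold_cycle(trace)
print(f"CT = {ct:.2f}")
```

The base-line window (cycles 1 to 15) and the factor of 10 standard deviations are the values quoted in the text; everything else, including the example trace, is made up for illustration.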

CT Values Provide a Quantitative Measurement of
Input Target Sequences

Figure 1B shows amplification plots of different PCR amplifications
overlaid. The amplifications were performed on a 1:2 serial dilution of
human genomic DNA. The amplified target was human β-actin. The
amplification plots shift to the right (to higher threshold cycles) as
the input target quantity is reduced. This is expected because reactions
with fewer starting copies of the target molecule require greater
amplification to degrade enough probe to attain the threshold
fluorescence. An arbitrary threshold of 10 standard deviations above the
base line was used to determine the CT values. Figure 1C represents the
CT values plotted versus the sample dilution value. Each dilution was
amplified in triplicate PCR amplifications and plotted as mean values
with error bars representing one standard deviation. The CT values
decrease linearly with increasing target quantity. Thus, CT values can
be used as a quantitative measurement of the input target number. It
should be noted that the amplification plot for the 15.6-ng sample shown
in Figure 1B does not reflect the same fluorescent rate of increase
exhibited by most of the other samples. The 15.6-ng sample also achieves
endpoint plateau at a lower fluorescent value than would be expected
based on the input DNA. This phenomenon has been observed occasionally
with other samples (data not shown) and may be attributable to late
cycle inhibition; this hypothesis is still under investigation. It is
important to note that the flattened slope and early plateau do not
impact significantly the calculated CT value, as demonstrated by the fit
on the line shown in Figure 1C. All triplicate amplifications resulted
in very similar CT values; the standard deviation did not exceed 0.5 for
any dilution. This experiment contains a >100,000-fold range of input
target molecules. Using CT values for quantitation permits a much larger
assay range than directly using total fluorescent emission intensity for
quantitation. The linear range of fluorescent intensity measurement of
the ABI Prism 7700 Sequence [...]ments over a very large range of
relative starting target quantities.
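The linear relationship between CT and the logarithm of input target described above is what a standard curve captures. A minimal sketch of such a fit, assuming an ordinary least-squares line through CT versus log10 of input quantity, is given below. The function name and the dilution data are hypothetical and are not taken from Figure 1C.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of CT = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ideal 1:2 dilution series: each halving of input costs one cycle.
quantities_ng = [100.0, 50.0, 25.0, 12.5]
cts = [20.0, 21.0, 22.0, 23.0]
slope, intercept = fit_standard_curve(quantities_ng, cts)
print(f"slope = {slope:.3f} cycles per log10 unit, intercept = {intercept:.2f}")
```

With perfect doubling per cycle the slope is -1/log10(2), about -3.32 cycles per tenfold change in input; real data will deviate from this.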

Sample Preparation Validation 

Several parameters influence the efficiency of PCR amplification:
magnesium and salt concentrations, reaction conditions (i.e., time and
temperature), PCR target size and composition, primer sequences, and
sample purity. All of the above factors are common to a single PCR
assay, except sample to sample purity. In an effort to validate the
method of sample preparation for the factor VIII assay, PCR
amplification reproducibility and efficiency of 10 replicate sample
preparations were examined. After genomic DNA was prepared from the 10
replicate samples, the DNA was quantitated by ultraviolet spectroscopy.
Amplifications were performed analyzing β-actin gene content in 100 and
25 ng of total genomic DNA. Each PCR amplification was performed in
triplicate. Comparison of CT values for each triplicate sample shows
minimal variation based on standard deviation and coefficient of
variance (Table 1). Therefore, each of the triplicate PCR amplifications
was highly reproducible, demonstrating that real time PCR using this
instrumentation introduces minimal variation into the quantitative PCR
analysis. Comparison of the mean CT values of the 10 replicate sample
preparations also showed minimal variability, indicating that each
sample preparation yielded similar results for β-actin gene quantity.
The highest CT difference between any of the samples was 0.85 and 0.73
for the 100 and 25 ng samples, respectively. Additionally, the
amplification of each sample exhibited an equivalent rate of fluorescent
emission intensity change per amount of DNA target analyzed, as
indicated by similar slopes derived from the sample dilutions (Fig. 2).
Any sample containing an excess of a PCR inhibitor would exhibit a
greater measured β-actin CT value for a given quantity of DNA. In
addition, the inhibitor would be diluted along with the sample in the
dilution analysis (Fig. 2), altering the expected CT value change. Each
sample amplification yielded a similar result in the analysis,
demonstrating that this method of sample preparation is highly
reproducible with regard to sample purity.
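The reproducibility measures used above (mean, standard deviation, and coefficient of variance of triplicate CT values) can be computed as in the following sketch. It assumes the sample standard deviation (n - 1 denominator) and a CV expressed as a percentage; the source does not state which conventions were used, and the function name and the triplicate values are illustrative.

```python
from statistics import mean, stdev


def triplicate_summary(ct_values):
    """Return (mean, SD, CV%) for replicate CT values."""
    m = mean(ct_values)
    sd = stdev(ct_values)      # sample SD; an assumption of this sketch
    cv = 100.0 * sd / m        # coefficient of variance in percent
    return m, sd, cv


# Illustrative triplicate of the kind summarised in Table 1.
m, sd, cv = triplicate_summary([18.54, 18.67, 19.00])
print(f"mean = {m:.2f}, SD = {sd:.2f}, CV = {cv:.2f}%")
```

Because CT is already a logarithmic measure of target quantity, a small CV on CT does not translate directly into the same relative error on copy number.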

Table 1. Reproducibility of Sample Preparation Method

         100 ng                                      25 ng
Sample   CT (triplicate)        Mean   SD    CV     CT (triplicate)        Mean   SD    CV
no.
1        18.24  18.23  18.33    18.27  0.06  0.32   20.48  20.55  20.5     20.51  0.03  0.17
2        18.33  18.35  18.44    18.37  0.06  [?]    20.61  20.59  20.41    20.54  0.11  0.54
3        18.3   18.3   18.42    18.34  0.07  0.36   20.54  20.6   20.49    20.54  0.06  0.28
4        18.15  18.23  18.32    18.23  0.08  0.46   20.48  20.44  20.38    20.43  0.05  0.26
5        18.4   18.38  18.4[?]  18.42  0.04  0.23   20.68  20.87  20.63    20.73  0.13  0.61
6        18.54  18.67  19       18.74  0.24  1.26   21.09  21.04  21.04    21.06  0.03  0.15
7        18.28  18.36  18.52    18.39  0.12  0.66   20.67  20.73  20.65    20.68  0.04  0.2
8        18.45  18.7   18.73    18.63  0.16  0.83   20.98  20.84  20.75    20.86  0.12  0.57
9        18.18  18.34  18.36    18.29  0.1   0.55   20.46  20.54  20.48    20.51  0.07  0.32
10       18.42  18.57  18.66    18.55  0.12  0.65   20.79  20.78  20.62    20.73  0.1   0.46
Mean                            18.42  0.17  0.90                          20.66  0.19  0.94

Quantitative Analysis of a Plasmid After Transfection



[...] vector containing a partial cDNA for human factor VIII, pF8TM. A
series of transfections was set up using a decreasing amount of the
plasmid (40, 4, 0.5, and 0.1 µg). Twenty-four hours post-transfection,
total DNA was purified from each flask of cells. β-Actin gene quantity
was chosen as a value for normalization of genomic DNA concentration
from each sample. In this experiment, β-actin gene content should remain
constant relative to total genomic DNA. Figure 3 shows the result of the
β-actin DNA measurement (100 ng total DNA determined by ultraviolet
spectroscopy) of each sample. Each sample was analyzed in triplicate and
the mean β-actin CT values of the triplicates were plotted (error bars
represent standard deviation). The highest difference between any two
sample means was 0.95 CT. Ten nanograms of total DNA of each sample were
also examined for β-actin. The results again showed that very similar
amounts of genomic DNA were present; the maximum mean β-actin CT value
difference was 1.0. As Figure 3 shows, the rate of β-actin CT change
between the 100 and 10-ng samples was similar (slope values range
between -3.56 and -3.45). This verifies again that the method of sample
preparation yields samples of identical PCR integrity (i.e., no sample
contained an excessive amount of a PCR inhibitor). However, these
results indicate that each sample contained slight differences in the
actual amount of genomic DNA analyzed. Determination of actual genomic
DNA concentration was accomplished by plotting the mean β-actin CT value
obtained for each 100-ng sample on a β-actin standard curve (shown in
Fig. 4C). The actual genomic DNA concentration of each sample, a, was
obtained by extrapolation to the x-axis.

Figure 2 Sample preparation purity. The replicate samples shown in
Table 1 were also amplified in triplicate using 25 ng of each DNA
sample. The figure shows the input DNA concentration (100 and 25 ng)
vs. CT. In the figure, the 100 and 25 ng points for each sample are
connected by a line.
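Reading a quantity off a standard curve, as described above for the β-actin CT values, amounts to inverting the fitted line. The sketch below assumes a curve of the form CT = slope * log10(quantity) + intercept. It also shows the commonly used relation between slope and amplification efficiency, E = 10^(-1/slope) - 1, which is a standard convention and is not stated in the source. Function names and the example intercept are hypothetical.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert CT = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def efficiency_from_slope(slope):
    """Per-cycle amplification efficiency implied by a standard-curve slope.

    E = 1.0 corresponds to perfect doubling each cycle. This relation is a
    common convention, not something the source text itself reports.
    """
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical curve: slope -3.5, intercept 25.0 (CT at quantity 1).
print(quantity_from_ct(21.5, slope=-3.5, intercept=25.0))

# Slopes quoted in the text for the 100-ng versus 10-ng comparison.
for s in (-3.56, -3.45):
    print(f"slope {s}: efficiency about {efficiency_from_slope(s):.2f}")
```

Under this convention the slopes of -3.56 and -3.45 quoted above correspond to efficiencies of roughly 0.9 to 0.95 per cycle, which is consistent with the statement that no sample carried an excess of PCR inhibitor.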

Figure 3 Analysis of transfected cell DNA quantity and purity. The DNA
preparations of the four 293 cell transfections (40, 4, 0.5, and 0.1 µg
of pF8TM) were analyzed for the β-actin gene. 100 and 10 ng (determined
by ultraviolet spectroscopy) of each sample were amplified in
triplicate. For each amount of pF8TM that was transfected, the β-actin
CT values are plotted versus the total input DNA

Figure 4A shows the measured (i.e., non-normalized) quantities of factor
VIII plasmid DNA (pF8TM) from each of the four transient cell
transfections. Each reaction contained 100 ng of total sample DNA (as
determined by UV spectroscopy). Each sample was analyzed in triplicate
PCR amplifications. As shown, pF8TM purified from the 293 cells
decreases (mean CT values increase) with decreasing amounts of plasmid
transfected. The mean CT values obtained for pF8TM in Figure 4A were
plotted on a standard curve comprised of serially diluted pF8TM, shown
in Figure 4B. The quantity of pF8TM, b, found in each of the four
transfections was determined by extrapolation to the x-axis of the
standard curve in Figure 4B. These uncorrected values, b, for pF8TM were
normalized to determine the actual amount of pF8TM found per 100 ng of
genomic DNA by using the equation:

\[ \frac{b \times 100\ \text{ng}}{a} = \text{actual pF8TM copies per 100 ng of genomic DNA} \]

where a = actual genomic DNA in a sample and b = pF8TM copies from the
standard curve. The normalized quantity of pF8TM per 100 ng of genomic
DNA for each of the four transfections is shown in Figure 4D. These
results show that the quantity of factor VIII plasmid associated with
the 293 cells, 24 hr after transfection, decreases with decreasing
plasmid concentration used in the transfection. The quantity of pF8TM
associated with 293 cells, after transfection with 40 µg of plasmid, was
35 pg per 100 ng genomic DNA. This results in ~520 plasmid copies per
cell.
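The normalization step above is a single proportion. A minimal sketch follows, with a hypothetical function name and made-up input values; b is in whatever unit the pF8TM standard curve uses, and a is the actual genomic DNA in nanograms read from the β-actin standard curve.

```python
def normalize_to_100ng(b, a_ng):
    """Scale the uncorrected plasmid quantity b to 100 ng of genomic DNA.

    Implements (b * 100 ng) / a, where a_ng is the actual genomic DNA
    in the reaction as determined from the β-actin standard curve.
    """
    if a_ng <= 0:
        raise ValueError("actual genomic DNA must be positive")
    return b * 100.0 / a_ng


# Hypothetical example: a nominal 100-ng reaction that really held 85 ng.
print(round(normalize_to_100ng(30.0, 85.0), 2))
```

When the ultraviolet reading is exact (a = 100 ng) the correction leaves b unchanged; samples that contained less genomic DNA than nominal are scaled up accordingly.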



DISCUSSION

We have described a new method for quantitating gene copy numbers using
real-time analysis of PCR amplifications. Real-time PCR is compatible
with either of the two PCR (RT-PCR) approaches: (1) quantitative
competitive, where an internal competitor for each target sequence is
used for normalization (data not shown), or (2) quantitative comparative
PCR using a normalization gene contained within the sample (i.e.,
β-actin) or a "housekeeping" gene for RT-PCR. If equal amounts of
nucleic acid are analyzed for each sample and if the amplification
efficiency before quantitative analysis is identical for each sample,
the internal control (normalization gene or competitor) should give
equal signals for all samples.

The real-time PCR method offers several advantages over the other two
methods currently employed (see the Introduction). First, the real-time
PCR method is performed in a closed-tube system and requires no post-PCR
manipulation of sample. Therefore, the potential for PCR contamination
in the laboratory is reduced because amplified products can be analyzed
and disposed of without opening the reaction tubes. Second, this method
supports the use of a normalization gene (i.e., β-actin) for
quantitative PCR or housekeeping genes for quantitative RT-PCR controls.
Analysis is performed in real time during the log phase of product
accumulation. Analysis during log phase permits many different genes
(over a wide input target range) to be analyzed simultaneously, without
concern of reaching reaction plateau at different cycles. This will make
multigene analysis assays much easier to develop, because individual
internal competitors will not be needed for each gene under analysis.
Third, sample throughput will increase dramatically with the new method
because there is no post-PCR processing time. Additionally, working in a
96-well format is highly compatible with automation technology.

Figure 4 Quantitative analysis of pF8TM in transfected cells. (A) Amount
of plasmid DNA used for the transfection plotted against the mean CT
value determined for pF8TM remaining 24 hr after transfection. (B,C)
Standard curves of pF8TM and β-actin, respectively. pF8TM DNA (B) and
genomic DNA (C) were diluted serially 1:5 before amplification with the
appropriate primers. The β-actin standard curve was used to normalize
the results of A to 100 ng of genomic DNA. (D) The amount of pF8TM
present per 100 ng of genomic DNA.

The real-time PCR method is highly reproducible. Replicate
amplifications can be analyzed for each sample, minimizing potential
error. The system allows for a very large assay dynamic range
(approaching 1,000,000-fold starting target). Using a standard curve for
the target of interest, relative copy number values can be determined
for any unknown sample. Fluorescent threshold values, CT, correlate
linearly with relative DNA copy numbers. Real time quantitative RT-PCR
methodology (Gibson et al., this issue) has also been developed.
Finally, real time quantitative PCR methodology can be used to develop
high-throughput screening assays for a variety of applications
[quantitative gene expression (RT-PCR), gene copy assays (Her2, HIV,
etc.), genotyping (knockout mouse analysis), and immuno-PCR].

Real-time PCR may also be performed using intercalating dyes (Higuchi et
al. 199[?]) such as ethidium bromide. The fluorogenic probe method
offers a major advantage over intercalating dyes: greater specificity
(i.e., primer dimers and nonspecific PCR products are not detected).

METHODS

Generation of a Plasmid Containing a Partial
cDNA for Human Factor VIII

Total RNA was harvested (RNAzol B from Tel-Test, Inc., Friendswood, TX)
from cells transfected with a factor VIII expression vector, pCIS2.8c25D
(Eaton et al.; Gorman et al. 1990). A factor VIII partial cDNA sequence
was generated by RT-PCR [GeneAmp EZ rTth RNA PCR Kit (part [illegible],
PE Applied Biosystems, Foster City, CA)] using the PCR primers F8for and
F8rev (primer sequences are shown below). The amplicon was reamplified
using modified F8for and F8rev primers (appended with BamHI and HindIII
restriction site sequences at the 5' end) and cloned into pGEM-3Z
(Promega Corp., Madison, WI). The resulting clone, pF8TM, was used for
transient transfection of 293 cells.



Amplification of Target DNA and Detection of Amplicon

Factor VIII plasmid DNA (pF8TM) was amplified with the primers F8for
5'-[sequence illegible]-3' and F8rev 5'-AAACGT[...]TAGG-3'. The reaction
produced a 422-bp PCR product. The forward primer was designed to
recognize a unique sequence found in the 5' untranslated region of the
parent pCIS2.8c25D plasmid and therefore does not recognize and amplify
the human factor VIII gene. Primers were chosen with the assistance of
the computer program Oligo 4.0 (National Biosciences, Inc., Plymouth,
MN). The human β-actin gene was amplified with the primers β-actin
forward primer 5'-TCACCCACACTGTGCCCATCTACGA-3' and β-actin reverse
primer 5'-CAGCGGAACCGCTCATTGCCAATGG-3'. The reaction produced a 295-bp
PCR product.

Amplification reactions (50 µl) contained a DNA sample, 10x PCR Buffer
II (5 µl), 200 µM dATP, dCTP, dGTP, and 400 µM dUTP, 4 mM MgCl2, 1.25
units AmpliTaq DNA polymerase, 0.5 unit AmpErase uracil N-glycosylase
(UNG), [?]0 pmole of each factor VIII primer, and 15 pmole of each
β-actin primer. The reactions also contained one of the following
detection probes (100 nM each): F8probe
5'-(FAM)A[...]TCTGCCTT(TAMRA)p-3' and β-actin probe
5'-(FAM)ATGCCC-X(TAMRA)-CCCCCATGCCATCp-3', where p indicates
phosphorylation and X indicates a linker arm nucleotide. Reaction tubes
were MicroAmp Optical Tubes (part number [illegible], Perkin Elmer) that
were frosted (at Perkin Elmer) to prevent light from reflecting. Tube
caps were similar to MicroAmp Caps but specially designed to prevent
light scattering. All of the PCR consumables were supplied by PE Applied
Biosystems (Foster City, CA) except the factor VIII primers, which were
synthesized at Genentech, Inc. (South San Francisco, CA). Probes were
designed using the Oligo 4.0 software, following guidelines suggested in
the Model 7700 Sequence Detector instrument manual. Briefly, probe Tm
should be at least 5°C higher than the annealing temperature used during
thermal cycling; primers should not form stable duplexes with the probe.

The thermal cycling conditions included 2 min at 50°C and 10 min at
95°C. Thermal cycling proceeded with [...] reactions were performed in
the Model 7700 Sequence Detector (PE Applied Biosystems), which contains
a GeneAmp PCR System. Reaction conditions were programmed on a Power
Macintosh 7100 (Apple Computer, Santa Clara, CA) linked directly to the
Model 7700 Sequence Detector. Analysis of data was also performed on the
Macintosh computer. Collection and analysis software was developed at PE
Applied Biosystems.

Transfection of Cells with Factor VIII Construct

Four T175 flasks of 293 cells (ATCC CRL 1573), a human fetal kidney
suspension cell line, were grown to 80% confluency and transfected with
pF8TM. Cells were grown in the following media: 50% HAM'S F12 without
GHT, 50% low glucose Dulbecco's modified Eagle medium (DMEM) without
glycine with sodium bicarbonate, 10% fetal bovine serum, 2 mM
L-glutamine, and 1% penicillin-streptomycin. The media was changed 30
[...] before transfection. pF8TM DNA amounts of 40, 4, 0.5, and 0.1 µg
were added to 1.5 ml of a solution containing 0.125 M CaCl2 and 1x
HEPES. The four mixtures were left at room temperature for 10 min and
then added dropwise to the cells. The flasks were incubated at 37°C and
5% CO2 for 24 hr, washed with PBS, and resuspended in PBS. The
resuspended cells were divided into aliquots and DNA was extracted
immediately using the QIAamp Blood Kit (Qiagen, Chatsworth, CA). DNA was
eluted into 200 µl of 30 mM Tris-HCl at pH 8.0, [...]

methods. Peptides AENK or AEQK were dissolved in water, made isotonic with
NaCl and diluted into RPMI growth medium. T-cell-proliferation assays were
done essentially as described 20,21 . Briefly, after antigen pulsing (30 µg ml^-1
TTCF) with tetrapeptides (1-2 mg ml^-1), PBMCs or EBV-B cells were
washed in PBS and fixed for 45 s in 0.05% glutaraldehyde. Glycine was added
to a final concentration of 0.1 M and the cells were washed five times in RPMI
1640 medium containing 1% FCS before co-culture with T-cell clones in
round-bottom 96-well microtitre plates. After 48 h, the cultures were pulsed
with 1 µCi of 3H-thymidine and harvested for scintillation counting 16 h later.
Predigestion of native TTCF was done by incubating 200 µg TTCF with 0.25 µg
pig kidney legumain in 500 µl 50 mM citrate buffer, pH 5.5, for 1 h at 37 °C.
Glycopeptide digestions. The peptides HIDNEEDI, HIDN(N-glucosamine)EEDI
and HIDNESDI, which are based on the TTCF sequence, and
QQQHLFGSNVTDCSGNFCLFR(KKK), which is based on human transferrin,
were obtained by custom synthesis. The three C-terminal lysine residues were
added to the natural sequence to aid solubility. The transferrin glycopeptide
QQQHLFGSNVTDCSGNFCLFR was prepared by tryptic (Promega) digestion
of 5 mg reduced, carboxy-methylated human transferrin followed by
concanavalin A chromatography 11 . Glycopeptides corresponding to residues
622-642 and 421-452 were isolated by reverse-phase HPLC and identified by
mass spectrometry and N-terminal sequencing. The lyophilized
transferrin-derived peptides were redissolved in 50 mM sodium acetate, pH 5.5, 10 mM
dithiothreitol, 20% methanol. Digestions were performed for 3 h at 30 °C with
5-50 mU ml^-1 pig kidney legumain or B-cell AEP. Products were analysed by
HPLC or MALDI-TOF mass spectrometry using a matrix of 10 mg ml^-1
α-cyanocinnamic acid in 50% acetonitrile/0.1% TFA and a PerSeptive Biosystems
Elite STR mass spectrometer set to linear or reflector mode. Internal
standardization was obtained with a matrix ion of 568.13 mass units.

Fas ligand (FasL) is produced by activated T cells and natural
killer cells and it induces apoptosis (programmed cell death) in
target cells through the death receptor Fas/Apo1/CD95 (ref. 1).
One important role of FasL and Fas is to mediate immune-cytotoxic
killing of cells that are potentially harmful to the
organism, such as virus-infected or tumour cells 1 . Here we
report the discovery of a soluble decoy receptor, termed decoy
receptor 3 (DcR3), that binds to FasL and inhibits FasL-induced
apoptosis. The DcR3 gene was amplified in about half of 35
primary lung and colon tumours studied, and DcR3 messenger
RNA was expressed in malignant tissue. Thus, certain tumours
may escape FasL-dependent immune-cytotoxic attack by expressing
a decoy receptor that blocks FasL.

By searching expressed sequence tag (EST) databases, we identi- 
fied a set of related ESTs that showed homology to the tumour 
necrosis factor (TNF) receptor (TNFR) gene superfamily 2 . Using 
the overlapping sequence, we isolated a previously unknown full- 
length complementary DNA from human fetal lung. We named the 
protein encoded by this cDNA decoy receptor 3 (DcR3). The cDNA 
encodes a 300-amino-acid polypeptide that resembles members of 
the TNFR family (Fig. 1a): the amino terminus contains a leader
sequence, which is followed by four tandem cysteine-rich domains
(CRDs). Like one other TNFR homologue, osteoprotegerin (OPG) 3 ,
DcR3 lacks an apparent transmembrane sequence, which indicates
that it may be a secreted, rather than a membrane-associated,
molecule. We expressed a recombinant, histidine-tagged form of 
DcR3 in mammalian cells; DcR3 was secreted into the cell culture 
medium, and migrated on polyacrylamide gels as a protein of 
relative molecular mass 35,000 (data not shown). DcR3 shares 
sequence identity in particular with OPG (31%) and TNFR2 
(29%), and has relatively less homology with Fas (17%). All of 
the cysteines in the four CRDs of DcR3 and OPG are conserved; 
however, the carboxy- terminal portion of DcR3 is 101 residues 
shorter. 

We analysed expression of DcR3 mRNA in human tissues by 
northern blotting (Fig. 1b). We detected a predominant 1.2-kilobase
transcript in fetal lung, brain, and liver, and in adult spleen, colon 
and lung. In addition, we observed relatively high DcR3 mRNA 
expression in the human colon carcinoma cell line SW480. 

To investigate potential ligand interactions of DcR3, we generated 
a recombinant, Fc-tagged DcR3 protein. We tested binding of 
DcR3-Fc to human 293 cells transfected with individual TNF- 
family ligands, which are expressed as type 2 transmembrane 
proteins (these transmembrane proteins have their N termini in 
the cytosol). DcR3-Fc showed a significant increase in binding to 
cells transfected with FasL 4 (Fig. 2a), but not to cells transfected with
TNF 5 , Apo2L/TRAIL 6,7 , Apo3L/TWEAK 8,9 , or OPGL/TRANCE/RANKL 10-12
(data not shown). DcR3-Fc immunoprecipitated shed
FasL from FasL-transfected 293 cells (Fig. 2b) and purified soluble
FasL (Fig. 2c), as did the Fc-tagged ectodomain of Fas but not
TNFR1. Gel-filtration chromatography showed that DcR3-Fc and
soluble FasL formed a stable complex (Fig. 2d). Equilibrium
analysis indicated that DcR3-Fc and Fas-Fc bound to soluble
FasL with a comparable affinity (Kd = 0.8 ± 0.2 and
1.1 ± 0.1 nM, respectively; Fig. 2e), and that DcR3-Fc could
block nearly all of the binding of soluble FasL to Fas-Fc (Fig. 2e,
inset). Thus, DcR3 competes with Fas for binding to FasL.
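To give a feel for what dissociation constants of about 0.8 and 1.1 nM imply, the sketch below evaluates fractional occupancy under a simple one-site equilibrium binding model. That model is an assumption of this example; the text reports only the Kd values and does not say how the equilibrium analysis was fitted. The function name and the concentrations used are hypothetical.

```python
def fraction_bound(free_conc_nM, kd_nM):
    """Fractional occupancy for one-site binding: [L] / (Kd + [L])."""
    if free_conc_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_conc_nM / (kd_nM + free_conc_nM)


# Kd values quoted in the text for DcR3-Fc and Fas-Fc binding to soluble FasL.
for name, kd in (("DcR3-Fc", 0.8), ("Fas-Fc", 1.1)):
    for conc in (0.1, 1.0, 10.0):   # illustrative concentrations in nM
        print(f"{name}: {conc} nM -> {fraction_bound(conc, kd):.2f} bound")
```

Under this model half of the binding sites are occupied when the free concentration equals Kd, so the two receptors behave almost identically across the range, in line with the "comparable affinity" described above.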

To determine whether binding of DcR3 inhibits FasL activity, we 
tested the effect of DcR3-Fc on apoptosis induction by soluble 
FasL in Jurkat T leukaemia cells, which express Fas (Fig. 3a). DcR3-Fc
and Fas-Fc blocked soluble-FasL-induced apoptosis in a
similar dose-dependent manner, with half-maximal inhibition at
~0.1 µg ml^-1 . Time-course analysis showed that the inhibition did
not merely delay cell death, but rather persisted for at least 24 hours 
(Fig. 3b). We also tested the effect of DcR3-Fc on activation- 
induced cell death (AICD) of mature T lymphocytes, a FasL- 
dependent process 1 . Consistent with previous results 13 , activation 
of interleukin-2-stimulated CD4-positive T cells with anti-CD3 
antibody increased the level of apoptosis twofold, and Fas-Fc 
blocked this effect substantially (Fig. 3c); DcR3-Fc blocked the 



induction of apoptosis to a similar extent. Thus, DcR3 binding 
blocks apoptosis induction by FasL. 

FasL-induced apoptosis is important in elimination of virus- 
infected cells and cancer cells by natural killer cells and cytotoxic T 
lymphocytes; an alternative mechanism involves perforin and 
granzymes 1,14-16 . Peripheral blood natural killer cells triggered
marked cell death in Jurkat T leukaemia cells (Fig. 3d); DcR3-Fc
and Fas-Fc each reduced killing of target cells from ~65% to
~30%, with half-maximal inhibition at ~1 µg ml^-1 ; the residual
killing was probably mediated by the perforin/granzyme pathway. 
Thus, DcR3 binding blocks FasL-dependent natural killer cell 
activity. Higher DcR3-Fc and Fas-Fc concentrations were required 
to block natural killer cell activity compared with those required to 
block soluble FasL activity, which is consistent with the greater 
potency of membrane-associated FasL compared with soluble 
FasL 17 . 

Given the role of immune-cytotoxic cells in elimination of 
tumour cells and the fact that DcR3 can act as an inhibitor of 
FasL, we proposed that DcR3 expression might contribute to the 
ability of some tumours to escape immune-cytotoxic attack. As 
genomic amplification frequently contributes to tumorigenesis, we 
investigated whether the DcR3 gene is amplified in cancer. We 
analysed DcR3 gene-copy number by quantitative polymerase chain
reaction (PCR) 18 in genomic DNA from 35 primary lung and colon
tumours, relative to pooled genomic DNA from peripheral blood
leukocytes (PBLs) of 10 healthy donors. Eight of 18 lung tumours
and 9 of 17 colon tumours showed DcR3 gene amplification,
ranging from 2- to 18-fold (Fig. 4a, b). To confirm this result, we
analysed the colon tumour DNAs with three more, independent sets
of DcR3-based PCR primers and probes; we observed nearly the
same amplification (data not shown).

Figure 1 Primary structure and expression of human DcR3. a, Alignment of the
amino-acid sequences of DcR3 and of osteoprotegerin (OPG); the C-terminal 101
residues of OPG are not shown. The putative signal cleavage site (arrow), the
cysteine-rich domains (CRD 1-4), and the N-linked glycosylation site (asterisk) are
shown. b, Expression of DcR3 mRNA. Northern hybridization analysis was done
using the DcR3 cDNA as a probe and blots of poly(A)+ RNA (Clontech) from
human fetal and adult tissues or cancer cell lines. PBL, peripheral blood
lymphocyte.

Figure 2 Interaction of DcR3 with FasL. a, 293 cells were transfected with pRK5
vector (top) or with pRK5 encoding full-length FasL (bottom), incubated with
DcR3-Fc (solid line, shaded area), TNFR1-Fc (dotted line) or buffer control
(dashed line) (the dashed and dotted lines overlap), and analysed for binding by
FACS. Statistical analysis showed a significant difference (P < 0.001) between the
binding of DcR3-Fc to cells transfected with FasL or pRK5. PE, phycoerythrin-
labelled cells. b, 293 cells were transfected as in a and metabolically labelled, and
cell supernatants were immunoprecipitated with Fc-tagged TNFR1, DcR3 or Fas.
c, Purified soluble FasL (sFasL) was immunoprecipitated with TNFR1-Fc, DcR3-Fc
or Fas-Fc and visualized by immunoblot with anti-FasL antibody. sFasL was
loaded directly for comparison in the right-hand lane. d, Flag-tagged sFasL was
incubated with DcR3-Fc or with buffer and resolved by gel filtration; column
fractions were analysed in an assay that detects complexes containing DcR3-Fc
and sFasL-Flag. e, Equilibrium binding of DcR3-Fc or Fas-Fc to sFasL-Flag.
Inset, competition of DcR3-Fc with Fas-Fc for binding to sFasL-Flag.

We then analysed DcR3 mRNA expression in primary tumour 
tissue sections by in situ hybridization. We detected DcR3 expres- 
sion in 6 out of 15 lung tumours, 2 out of 2 colon tumours, 2 out of 5 
breast tumours, and 1 out of 1 gastric tumour (data not shown). A 
section through a squamous-cell carcinoma of the lung is shown in 
Fig. 4c. DcR3 mRNA was localized to infiltrating malignant epithe- 
lium, but was essentially absent from adjacent stroma, indicating 
tumour-specific expression. Although the individual tumour speci- 
mens that we analysed for mRNA expression and gene amplification 
were different, the in situ hybridization results are consistent with 
the finding that the DcR3 gene is amplified frequently in tumours. 
SW480 colon carcinoma cells, which showed abundant DcR3 
mRNA expression (Fig. 1b), also had marked DcR3 gene amplification,
as shown by quantitative PCR (fourfold) and by Southern blot
hybridization (fivefold) (data not shown). 
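Fold-amplification figures such as those quoted in this section are commonly derived from the difference in threshold cycle between a test sample and a reference sample. The sketch below assumes that convention and perfect doubling per cycle by default; the text itself does not give the formula used, so the function name, the efficiency parameter, and the example CT values are all hypothetical.

```python
import math


def fold_amplification(ct_reference, ct_tumour, efficiency=1.0):
    """Relative gene copy number from a CT difference.

    Assumes the same amount of input DNA in both reactions and a
    per-cycle efficiency of `efficiency` (1.0 means perfect doubling).
    """
    return (1.0 + efficiency) ** (ct_reference - ct_tumour)


# Hypothetical CT values: the tumour DNA crosses threshold two cycles earlier.
print(fold_amplification(ct_reference=25.0, ct_tumour=23.0))

# CT shift that an 18-fold amplification would correspond to at perfect doubling.
print(round(math.log2(18), 2))
```

In practice the input DNA is usually also normalised against a reference gene, which this sketch leaves out.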

If DcR3 amplification in cancer is functionally relevant, then 
DcR3 should be amplified more than neighbouring genomic 
regions that are not important for tumour survival. To test this, 

Figure 3 Inhibition of FasL activity by DcR3. a, Human Jurkat T leukaemia cells
were incubated with Flag-tagged soluble FasL (sFasL; 5 ng ml⁻¹) oligomerized
with anti-Flag antibody (0.1 µg ml⁻¹) in the presence of the proposed inhibitors
DcR3-Fc, Fas-Fc or human IgG1 and assayed for apoptosis (mean ± s.e.m. of
triplicates). b, Jurkat cells were incubated with sFasL-Flag plus anti-Flag antibody
as in a, in the presence of 1 µg ml⁻¹ DcR3-Fc (filled circles), Fas-Fc (open circles) or
human IgG1 (triangles), and apoptosis was determined at the indicated time
points. c, Peripheral blood T cells were stimulated with PHA and interleukin-2,
followed by control (white bars) or anti-CD3 antibody (filled bars), together with
phosphate-buffered saline (PBS), human IgG1, Fas-Fc or DcR3-Fc (10 µg ml⁻¹).
After 16 h, apoptosis of CD4⁺ cells was determined (mean ± s.e.m. of results from
five donors). d, Peripheral blood natural killer cells were incubated with ⁵¹Cr-
labelled Jurkat cells in the presence of DcR3-Fc (filled circles), Fas-Fc (open
circles) or human IgG1 (triangles), and target-cell death was determined by
release of ⁵¹Cr (mean ± s.d. for two donors, each in triplicate).



we mapped the human DcR3 gene by radiation-hybrid analysis; 
DcR3 showed linkage to marker AFM218xe7 (T160), which maps to
chromosome position 20q13. Next, we isolated from a bacterial
artificial chromosome (BAC) library a human genomic clone that 
carries DcR3, and sequenced the ends of the clone's insert. We then 
determined, from the nine colon tumours that showed twofold or 
greater amplification of DcR3, the copy number of the DcR3- 
flanking sequences (reverse and forward) from the BAC, and of 
seven genomic markers that span chromosome 20 (Fig. 4d). The 
DcR3-linked reverse marker showed an average amplification of
roughly threefold, slightly less than the approximately fourfold 
amplification of DcR3; the other markers showed little or no 
amplification. These data indicate that DcR3 may be at the 'epi- 
centre' of a distal chromosome 20 region that is amplified in colon 
cancer, consistent with the possibility that DcR3 amplification 
promotes tumour survival. 

Our results show that DcR3 binds specifically to FasL and inhibits 
FasL activity. We did not detect DcR3 binding to several other TNF- 
ligand-family members; however, this does not rule out the possi- 
bility that DcR3 interacts with other ligands, as do some other 
TNFR family members, including OPG 2 lv . 

FasL is important in regulating the immune response; however, 
little is known about how FasL function is controlled. One mechan- 
ism involves the molecule cFLIP, which modulates apoptosis signal- 
ling downstream of Fas 20 . A second mechanism involves proteolytic 
shedding of FasL from the cell surface 17 . DcR3 competes with Fas for 

Figure 4 Genomic amplification of DcR3 in tumours. a, Lung cancers, comprising
eight adenocarcinomas (c, d, f, g, h, j, k, r), seven squamous-cell carcinomas (a, e,
m, n, o, p, q), one non-small-cell carcinoma (b), one small-cell carcinoma (i), and
one bronchial adenocarcinoma (l). The data are means ± s.d. of 2 experiments
done in duplicate. b, Colon tumours, comprising 17 adenocarcinomas. Data are
means ± s.e.m. of five experiments done in duplicate. c, In situ hybridization
analysis of DcR3 mRNA expression in a squamous-cell carcinoma of the lung. A
representative bright-field image (left) and the corresponding dark-field image
(right) show DcR3 mRNA over infiltrating malignant epithelium (arrowheads).
Adjacent non-malignant stroma (S), blood vessel (V) and necrotic tumour tissue
(N) are also shown. d, Average amplification of DcR3 compared with amplifica-
tion of neighbouring genomic regions (reverse and forward, Rev and Fwd), the
DcR3-linked marker T160, and other chromosome-20 markers, in the nine colon
tumours showing DcR3 amplification of twofold or more (b). Data are from two
experiments done in duplicate. Asterisk indicates P < 0.01 for a Student's t-test
comparing each marker with DcR3.

FasL binding; hence, it may represent a third mechanism of
extracellular regulation of FasL activity. A decoy receptor that
modulates the function of the cytokine interleukin-1 has been
described 21 . In addition, two decoy receptors that belong to the
TNFR family, DcR1 and DcR2, regulate the FasL-related apoptosis-
inducing molecule Apo2L 22 . Unlike DcR1 and DcR2, which are
membrane-associated proteins, DcR3 is directly secreted into the
extracellular space. One other secreted TNFR-family member is
OPG, which shares greater sequence homology with DcR3 (31%)
than do DcR1 (17%) or DcR2 (19%); OPG functions as a third
decoy for Apo2L 19 . Thus, DcR3 and OPG define a new subset of
TNFR-family members that function as secreted decoys to mod-
ulate ligands that induce apoptosis. Pox viruses produce soluble
TNFR homologues that neutralize specific TNF-family ligands, 
thereby modulating the antiviral immune response 2 . Our results 
indicate that a similar mechanism, namely, production of a soluble 
decoy receptor for FasL, may contribute to immune evasion by 
certain tumours. □ 



Methods 

Isolation of DcR3 cDNA. Several overlapping ESTs in GenBank (accession 
numbers AA025672, AA025673 and W67560) and in Lifeseq™ (Incyte 
Pharmaceuticals; accession numbers 1339238, 1533571, 1533650, 1542861, 
1789372 and 2207027) showed similarity to members of the TNFR family. We 
screened human cDNA libraries by PCR with primers based on the region of 
EST consensus; fetal lung was positive for a product of the expected size. By 
hybridization to a PCR-generated probe based on the ESTs, one positive clone 
(DNA30942) was identified. When searching for potential alternatively spliced 
forms of DcR3 that might encode a transmembrane protein, we isolated 50 
more clones; the coding regions of these clones were identical in size to that of
the initial clone (data not shown). 

Fc-fusion proteins (immunoadhesins). The entire DcR3 sequence, or the 
ectodomain of Fas or TNFR1, was fused to the hinge and Fc region of human 
IgG1, expressed in insect SF9 cells or in human 293 cells, and purified as
described". 

Fluorescence-activated cell sorting (FACS) analysis. We transfected 293
cells using calcium phosphate or Effectene (Qiagen) with pRK5 vector or pRK5
encoding full-length human FasL 4 (2 µg), together with pRK5 encoding CrmA
(2 µg) to prevent cell death. After 16 h, the cells were incubated with
biotinylated DcR3-Fc or TNFR1-Fc and then with phycoerythrin-conjugated
streptavidin (GibcoBRL), and were assayed by FACS. The data were analysed by
Kolmogorov-Smirnov statistical analysis. There was some detectable staining 
of vector-transfected cells by DcR3-Fc; as these cells express little FasL (data 
not shown), it is possible that DcR3 recognized some other factor that is 
expressed constitutively on 293 cells. 

Immunoprecipitation. Human 293 cells were transfected as above, and
metabolically labelled with [³⁵S]cysteine and [³⁵S]methionine (0.5 mCi;
Amersham). After 16 h of culture in the presence of z-VAD-fmk (10 µM),
the medium was immunoprecipitated with DcR3-Fc, Fas-Fc or TNFR1-Fc
(5 µg), followed by protein A-Sepharose (Repligen). The precipitates were
resolved by SDS-PAGE and visualized on a phosphorimager (Fuji BAS2000).
Alternatively, purified, Flag-tagged soluble FasL (1 µg) (Alexis) was incubated
with each Fc-fusion protein (1 µg), precipitated with protein A-Sepharose,
resolved by SDS-PAGE and visualized by immunoblotting with rabbit anti-
FasL antibody (Oncogene Research).

Analysis of complex formation. Flag-tagged soluble FasL (25 µg) was
incubated with buffer or with DcR3-Fc (40 µg) for 1.5 h at 24 °C. The reaction
was loaded onto a Superdex 200 HR 10/30 column (Pharmacia) and developed
with PBS; 0.6-ml fractions were collected. The presence of DcR3-Fc-FasL
complex in each fraction was analysed by placing 100 µl aliquots into microtitre
wells precoated with anti-human IgG (Boehringer) to capture DcR3-Fc,
followed by detection with biotinylated anti-Flag antibody Bio M2 (Kodak) and
streptavidin-horseradish peroxidase (Amersham). Calibration of the column
indicated an apparent relative molecular mass of the complex of 420K (data not 
shown), which is consistent with a stoichiometry of two DcR3-Fc homodimers 
to two soluble FasL homotrimers. 

Equilibrium binding analysis. Microtitre wells were coated with anti-human 
IgG and blocked with 2% BSA in PBS. DcR3-Fc or Fas-Fc was added, followed by
serially diluted Flag-tagged soluble FasL. Bound ligand was detected with anti-
Flag antibody as above. In the competition assay, Fas-Fc was immobilized as
above, and the wells were blocked with excess IgG1 before addition of Flag-
tagged soluble FasL plus DcR3-Fc.

T-cell AICD. CD3⁺ lymphocytes were isolated from peripheral blood of
individual donors using anti-CD3 magnetic beads (Miltenyi Biotech),
stimulated with phytohaemagglutinin (PHA; 2 µg ml⁻¹) for 24 h, and cultured
in the presence of interleukin-2 (100 U ml⁻¹) for 5 days. The cells were plated in
wells coated with anti-CD3 antibody (Pharmingen) and analysed for apoptosis
16 h later by FACS analysis of annexin-V-binding of CD4⁺ cells 24 .

Natural killer cell activity. Natural killer cells were isolated from peripheral
blood of individual donors using anti-CD56 magnetic beads (Miltenyi
Biotech), and incubated for 16 h with ⁵¹Cr-loaded Jurkat cells at an effector-
to-target ratio of 1:1 in the presence of DcR3-Fc, Fas-Fc or human IgG1.
Target-cell death was determined by release of ⁵¹Cr in effector-target co-
cultures relative to release of ⁵¹Cr by detergent lysis of equal numbers of Jurkat
cells.

Gene-amplification analysis. Surgical specimens were provided by J. Kern 
(lung tumours) and P. Quirke (colon tumours). Genomic DNA was extracted 
(Qiagen) and the concentration was determined using Hoechst dye 33258 
intercalation fluorometry. Amplification was determined by quantitative PCR
using a TaqMan instrument (ABI). The method was validated by comparison of
PCR and Southern hybridization data for the Myc and HER-2 oncogenes (data
not shown). Gene-specific primers and fluorogenic probes were designed on
the basis of the sequence of DcR3 or of nearby regions identified on a BAC
carrying the human DcR3 gene; alternatively, primers and probes were based
on Stanford Human Genome Center marker AFM218xe7 (T160), which is
linked to DcR3 (likelihood score = 5.4), SHGC-36268 (T159), the nearest
available marker, which maps to ~500 kilobases from T160, and five extra
markers that span chromosome 20. The DcR3-specific primer sequences were
5'-CTTCTTCGCGCACGCTG-3' and 5'-ATCACGCCGGCACCAG-3' and the
fluorogenic probe sequence was 5'-(FAM)-ACACGATGCGTGCTCCAAGCAGAAp-(TAMRA),
where FAM is 5'-fluorescein phosphoramidite. Relative
gene-copy numbers were derived using the formula 2^(ΔCt), where ΔCt is the
difference in amplification cycles required to detect DcR3 in peripheral blood
lymphocyte DNA compared to test DNA.
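
The relative copy-number formula above can be illustrated with a minimal sketch. The function name and the cycle values are hypothetical and are not taken from the paper; the base of 2 assumes that the PCR product doubles in every cycle.

```python
def relative_copy_number(ct_reference: float, ct_test: float) -> float:
    """Relative gene-copy number from the 2^(delta Ct) formula.

    ct_reference: cycle at which the gene is detected in normal
                  (peripheral blood lymphocyte) DNA.
    ct_test:      cycle at which the gene is detected in the test DNA.
    Assumes perfect doubling per cycle (an idealization).
    """
    delta_ct = ct_reference - ct_test  # an amplified gene is detected earlier
    return 2.0 ** delta_ct


# Hypothetical values: detection two cycles earlier in tumour DNA
# corresponds to a fourfold relative copy number.
print(relative_copy_number(26.0, 24.0))
```

With these made-up inputs the result is 4.0, the order of amplification reported for DcR3 in the colon tumours.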

ABC transporters (also known as traffic ATPases) form a large
family of proteins responsible for the translocation of a variety
of compounds across membranes of both prokaryotes and
eukaryotes 1 . The recently completed Escherichia coli genome
sequence revealed that the largest family of paralogous E. coli
proteins is composed of ABC transporters 2 . Many eukaryotic
proteins of medical significance belong to this family, such as
the cystic fibrosis transmembrane conductance regulator (CFTR),
the P-glycoprotein (or multidrug-resistance protein) and the
heterodimeric transporter associated with antigen processing
(Tap1-Tap2). Here we report the crystal structure at 1.5 Å resolu-
tion of HisP, the ATP-binding subunit of the histidine permease,
which is an ABC transporter from Salmonella typhimurium. We 
correlate the details of this structure with the biochemical, genetic 
and biophysical properties of the wild-type and several mutant 
HisP proteins. The structure provides a basis for understanding 
properties of ABC transporters and of defective CFTR proteins. 

ABC transporters contain four structural domains: two nucleo- 
tide-binding domains (NBDs), which are highly conserved 
throughout the family, and two transmembrane domains 1 . In 
prokaryotes these domains are often separate subunits which are 
assembled into a membrane-bound complex; in eukaryotes the
domains are generally fused into a single polypeptide chain. The
periplasmic histidine permease of S. typhimurium and E. coli is a
well-characterized ABC transporter that is a good model for this
superfamily. It consists of a membrane-bound complex, HisQMP₂,
which comprises integral membrane subunits, HisQ and HisM, and
two copies of HisP, the ATP-binding subunit. HisP, which has
properties intermediate between those of integral and peripheral
membrane proteins 9 , is accessible from both sides of the membrane,
presumably by its interaction with HisQ and HisM 6 . The two HisP
subunits form a dimer, as shown by their cooperativity in ATP
hydrolysis 5 , the requirement for both subunits to be present for
activity, and the formation of a HisP dimer upon chemical cross-
linking. Soluble HisP also forms a dimer 3 . HisP has been purified
and characterized in an active soluble form 3 which can be recon-
stituted into a fully active membrane-bound complex 8 .

The overall shape of the crystal structure of the HisP monomer is
that of an 'L' with two thick arms (arm I and arm II); the ATP-
binding pocket is near the end of arm I (Fig. 1). A six-stranded β-
sheet (β3 and β8-β12) spans both arms of the L, with a domain of an
α- plus β-type structure (β1, β2, β4-β7, α1 and α2) on one side
(within arm I) and a domain of mostly α-helices (α3-α9) on the







Figure 1 Crystal structure of HisP. a, View of the dimer along an axis
perpendicular to its two-fold axis. The top and bottom of the dimer are suggested
to face towards the periplasmic and cytoplasmic sides, respectively (see text).
The thickness of arm II is about 25 Å, comparable to that of membrane. α-Helices
are shown in orange and β-sheets in green. b, View along the two-fold axis of the
HisP dimer, showing the relative displacement of the monomers not apparent in
a. The β-strands at the dimer interface are labelled. c, View of one monomer from
the bottom of arm I, as shown in a, towards arm II, showing the ATP-binding
pocket. a-c, The protein and the bound ATP are in 'ribbon' and 'ball-and-stick'
representations, respectively. Key residues discussed in the text are indicated in
c. These figures were prepared with MOLSCRIPT 29 . N, amino terminus; C, C
terminus.

Gene amplification is a common event in the progression of
human cancers, and amplified oncogenes have been shown to 
have diagnostic, prognostic and therapeutic relevance. A 
kinetic quantitative polymerase-chain-reaction (PCR) method, 
based on fluorescent TaqMan methodology and a new instru- 
ment (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real-time, was used to quantify 
gene amplification in tumor DNA. Reactions are character- 
ized by the point during cycling when PCR amplification is still 
in the exponential phase, rather than the amount of PCR 
product accumulated after a fixed number of cycles. None of 
the reaction components is limited during the exponential 
phase, meaning that values are highly reproducible in reac- 
tions starting with the same copy number. This greatly 
improves the precision of DNA quantification. Moreover, 
real-time PCR does not require post-PCR sample handling, 
thereby preventing potential PCR-product carry-over con- 
tamination; it possesses a wide dynamic range of quantifica- 
tion and results in much faster and higher sample throughput. 
The real-time PCR method was used to develop and validate
a simple and rapid assay for the detection and quantification
of the 3 most frequently amplified genes (myc, ccnd1 and
erbB2) in breast tumors. Extra copies of myc, ccnd1 and erbB2
were observed in 10, 23 and 15%, respectively, of 108 breast-
tumor DNA samples; the largest observed numbers of gene copies
were 4.6, 18.6 and 15.1, respectively. These results correlated
well with those of Southern blotting. The use of this new
semi-automated technique will make molecular analysis of
human cancers simpler and more reliable, and should find
broad applications in clinical and research settings. Int. J.
Cancer 78:661-666, 1998.
© 1998 Wiley-Liss, Inc.

Gene amplification plays an important role in the pathogenesis 
of various solid tumors, including breast cancer, probably because 
over-expression of the amplified target genes confers a selective 
advantage. The first technique used to detect genomic amplification 
was cytogenetic analysis. Amplification of several chromosome 
regions, visualized either as extrachromosomal double minutes 
(dmins) or as integrated homogeneously staining regions (HSRs), 
are among the main visible cytogenetic abnormalities in breast 
tumors. Other techniques such as comparative genomic hybridiza-
tion (CGH) (Kallioniemi et al., 1994) have also been used in broad
searches for regions of increased DNA copy numbers in tumor 
cells, and have revealed some 20 amplified chromosome regions in 
breast tumors. Positional cloning efforts are underway to identify 
the critical gene(s) in each amplified region. To date, genes known 
to be amplified frequently in breast cancers include myc (8q24), 
ccnd1 (11q13), and erbB2 (17q12-q21) (for review, see Bieche and
Lidereau, 1995). 

Amplification of the myc, ccnd1, and erbB2 proto-oncogenes
should have clinical relevance in breast cancer, since independent
studies have shown that these alterations can be used to identify
sub-populations with a worse prognosis (Berns et al., 1992;
Schuuring et al., 1992; Slamon et al., 1987). Muss et al. (1994)
suggested that these gene alterations may also be useful for the 
prediction and assessment of the efficacy of adjuvant chemotherapy 
and hormone therapy. 

However, published results diverge both in terms of the fre- 
quency of these alterations and their clinical value. For instance, 
over 500 studies in 10 years have failed to resolve the controversy 



surrounding the link suggested by Slamon et al. (1987) between
erbB2 amplification and disease progression. These discrepancies
are partly due to the clinical, histological and ethnic heterogeneity 
of breast cancer, but technical considerations are also probably 
involved. 

Specific genes (DNA) were initially quantified in tumor cells by 
means of blotting procedures such as Southern and slot blotting. 
These batch techniques require large amounts of DNA (5-10 
µg/reaction) to yield reliable quantitative results. Furthermore,
meticulous care is required at all stages of the procedures to 
generate blots of sufficient quality for reliable dosage analysis. 
Recently, PCR has proven to be a powerful tool for quantitative 
DNA analysis, especially with minimal starting quantities of tumor 
samples (small, early-stage tumors and formalin-fixed, paraffin- 
embedded tissues). 

Quantitative PCR can be performed by evaluating the amount of 
product either after a given number of cycles (end-point quantita- 
tive PCR) or after a varying number of cycles during the 
exponential phase (kinetic quantitative PCR). In the first case, an 
internal standard distinct from the target molecule is required to 
ascertain PCR efficiency. The method is relatively easy but implies 
generating, quantifying and storing an internal standard for each 
gene studied. Nevertheless, it is the most frequently applied 
method to date. 

One of the major advantages of the kinetic method is its rapidity 
in quantifying a new gene, since no internal standard is required (an 
external standard curve is sufficient). Moreover, the kinetic method 
has a wide dynamic range (at least 5 orders of magnitude), giving 
an accurate value for samples differing in their copy number. 
Unfortunately, the method is cumbersome and has therefore been 
rarely used. It involves aliquot sampling of each assay mix at 
regular intervals and quantifying, for each aliquot, the amplifica- 
tion product. Interest in the kinetic method has been stimulated by a 
novel approach using fluorescent TaqMan methodology and a new 
instrument (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real time (Gibson et al., 1996; Heid et
al., 1996). The TaqMan reaction is based on the 5' nuclease assay 
first described by Holland et al. (1991). The latter uses the 5' 
nuclease activity of Taq polymerase to cleave a specific fluorogenic 
oligonucleotide probe during the extension phase of PCR. The 
approach uses dual-labeled fluorogenic hybridization probes (Lee 
et al., 1993). One fluorescent dye, covalently linked to the 5' end
of the oligonucleotide, serves as a reporter [FAM (i.e., 6-carboxy- 
fluorescein)] and its emission spectrum is quenched by a second 
fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethyl-rhodamine) 
attached to the 3' end. During the extension phase of the PCR 
cycle, the fluorescent hybridization probe is hydrolyzed by the
5'-3' nucleolytic activity of DNA polymerase. Nuclease degrada- 
tion of the probe releases the quenching of FAM fluorescence 
emission, resulting in an increase in peak fluorescence emission. 
The fluorescence signal is normalized by dividing the emission 
intensity of the reporter dye (FAM) by the emission intensity of a 
reference dye (i.e., ROX, 6-carboxy-X-rhodamine) included in 
TaqMan buffer, to obtain a ratio defined as the Rn (normalized 
reporter) for a given reaction tube. The use of a sequence detector 
enables the fluorescence spectra of all 96 wells of the thermal 
cycler to be measured continuously during PCR amplification. 
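
The normalization described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and taking the baseline as the mean Rn of the first 15 cycles is an assumption (the caption of Figure 1 states only that the baseline is established in the first 15 PCR cycles).

```python
def normalized_reporter(fam: float, rox: float) -> float:
    """Rn: reporter (FAM) emission divided by reference (ROX) emission."""
    return fam / rox


def delta_rn(rn_by_cycle: list[float], baseline_cycles: int = 15) -> list[float]:
    """Subtract the baseline Rn from the Rn value of every cycle.

    The baseline is taken here as the mean Rn over the first
    `baseline_cycles` cycles (an assumption made for this sketch).
    """
    baseline = sum(rn_by_cycle[:baseline_cycles]) / baseline_cycles
    return [rn - baseline for rn in rn_by_cycle]


# Hypothetical trace: flat for 15 cycles, then rising as the probe is cleaved.
rn_trace = [1.0] * 15 + [1.5, 2.0]
print(delta_rn(rn_trace)[-1])
```

Dividing by the ROX signal removes well-to-well differences that are unrelated to amplification, such as small pipetting or optical variations.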

The real-time PCR method offers several advantages over other
current quantitative PCR methods (Celi et al., 1994): (i) the
probe-based homogeneous assay provides a real-time method for
detecting only specific amplification products, since specific hybridi-
zation of both the primers and the probe is necessary to generate a
signal; (ii) the Ct (threshold cycle) value used for quantification is
measured when PCR amplification is still in the log phase of PCR
product accumulation. This is the main reason why Ct is a more
reliable measure of the starting copy number than are end-point
measurements, in which a slight difference in a limiting component
can have a drastic effect on the amount of product; (iii) use of Ct
values gives a wider dynamic range (at least 5 orders of magni-
tude), reducing the need for serial dilution; (iv) the real-time PCR
method is run in a closed-tube system and requires no post-PCR
sample handling, thus avoiding potential contamination; (v) the
system is highly automated, since the instrument continuously
measures fluorescence in all 96 wells of the thermal cycler during
PCR amplification and the corresponding software processes and
analyzes the fluorescence data; (vi) the assay is rapid, as results are
available just one minute after thermal cycling is complete; (vii) the 
sample throughput of the method is high, since 96 reactions can be 
analyzed in 2 hr. 

Here, we applied this semi-automated procedure to determine 
the copy numbers of the 3 most frequently amplified genes in breast 
tumors (myc, ccnd1 and erbB2), as well as 2 genes (alb and app)
located in a chromosome region in which no genetic changes have 
been observed in breast tumors. The results for 108 breast tumors 
were compared with previous Southern-blot data for the same 
samples. 

MATERIAL AND METHODS 
Tumor and blood samples 

Samples were obtained from 108 primary breast tumors removed
surgically from patients at the Centre René Huguenin; none of the
patients had undergone radiotherapy or chemotherapy. Immedi- 
ately after surgery, the tumor samples were placed in liquid 
nitrogen until extraction of high-molecular-weight DNA. Patients 
were included in this study if the tumor sample used for DNA 
preparation contained more than 60% of tumor cells (histological 
analysis). A blood sample was also taken from 18 of the same 
patients. 

DNA was extracted from tumor tissue and blood leukocytes 
according to standard methods. 

Real-time PCR 

Theoretical basis. Reactions are characterized by the point 
during cycling when amplification of the PCR product is first 
detected, rather than by the amount of PCR product accumulated 
after a fixed number of cycles. The higher the starting copy number 
of the genomic DNA target, the earlier a significant increase in 
fluorescence is observed. The parameter Ct (threshold cycle) is
defined as the fractional cycle number at which the fluorescence
generated by cleavage of the probe passes a fixed threshold above
baseline. The target gene copy number in unknown samples is
quantified by measuring Ct and by using a standard curve to
determine the starting copy number. The precise amount of
genomic DNA (based on optical density) and its quality (i.e., lack
of extensive degradation) are both difficult to assess. We therefore
also quantified a control gene (alb) mapping to chromosome region
4q11-q13, in which no genetic alterations have been found in
breast-tumor DNA by means of CGH (Kallioniemi et al., 1994).

Thus, the ratio of the copy number of the target gene to the copy 
number of the alb gene normalizes the amount and quality of 
genomic DNA. The ratio defining the level of amplification is 
termed "N", and is determined as follows: 

N = (copy number of target gene [app, myc, ccnd1 or erbB2]) / (copy number of reference gene [alb])
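
As a worked illustration of the ratio N, the sketch below divides a target-gene copy number by the alb copy number measured in the same DNA sample. The function name and the copy numbers are hypothetical.

```python
def gene_dose(target_copies: float, alb_copies: float) -> float:
    """N: target-gene copy number normalized to the alb reference gene."""
    if alb_copies <= 0:
        raise ValueError("reference copy number must be positive")
    return target_copies / alb_copies


# Hypothetical sample: 13200 target copies against 6600 alb copies.
print(gene_dose(13200.0, 6600.0))
```

Because both copy numbers come from the same DNA preparation, an error in the estimated amount or quality of input DNA affects numerator and denominator alike and cancels in the ratio.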

Primers, probes, reference human genomic DNA and PCR 
consumables. Primers and probes were chosen with the assistance 
of the computer programs Oligo 4.0 (National Biosciences, Ply- 
mouth, MN), EuGene (Daniben Systems, Cincinnati, OH) and Primer 
Express (Perkin-Elmer Applied Biosystems, Foster City, CA). 

Primers were purchased from DNAgency (Malvern, PA) and 
probes from Perkin-Elmer Applied Biosystems. 

Nucleotide sequences for the oligonucleotide hybridization 
probes and primers are available on request. 

The TaqMan PCR Core reagent kit, MicroAmp optical tubes, 
and MicroAmp caps were from Perkin-Elmer Applied Biosystems. 

Standard-curve construction. The kinetic method requires a 
standard curve. The latter was constructed with serial dilutions of 
specific PCR products, according to Piatak et al. (1993). In
practice, each specific PCR product was obtained by amplifying 20
ng of a standard human genomic DNA (Boehringer, Mannheim,
Germany) with the same primer pairs as those used later for
real-time quantitative PCR. The 5 PCR products were purified
using MicroSpin S-400 HR columns (Pharmacia, Uppsala, Swe-
den), electrophoresed through an acrylamide gel and stained with
ethidium bromide to check their quality. The PCR products were
then quantified spectrophotometrically and pooled, and serially
diluted 10-fold in mouse genomic DNA (Clontech, Palo Alto, CA)
at a constant concentration of 2 ng/µl. The standard curve used for
real-time quantitative PCR was based on serial dilutions of the pool
of PCR products ranging from 10⁻⁷ (10⁵ copies of each gene) to
10⁻¹⁰ (10² copies). This series of diluted PCR products was
aliquoted and stored at -80°C until use.

The standard curve was validated by analyzing 2 known 
quantities of calibrator human genomic DNA (20 ng and 50 ng). 
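
The use of such a standard curve can be sketched as a straight-line fit of Ct against the logarithm of the starting copy number, which is then inverted for unknown samples. This is a minimal sketch under stated assumptions: the function names and the Ct values are hypothetical, and ordinary least squares is assumed; the fitting procedure of the instrument software is not described in the text.

```python
import math


def fit_standard_curve(copies: list[float], cts: list[float]) -> tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards spanning 10^5 to 10^2 copies, as in the dilution series.
standard_copies = [1e5, 1e4, 1e3, 1e2]
standard_cts = [20.0, 23.5, 27.0, 30.5]
slope, intercept = fit_standard_curve(standard_copies, standard_cts)
print(slope, intercept, copies_from_ct(27.0, slope, intercept))
```

The slope is negative because a higher starting copy number crosses the threshold at an earlier cycle.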

PCR amplification. Amplification mixes (50 µl) contained the
sample DNA (around 20 ng, around 6600 copies of disomic genes),
10X TaqMan buffer (5 µl), 200 µM dATP, dCTP, dGTP, and 400
µM dUTP, 5 mM MgCl₂, 1.25 units of AmpliTaq Gold, 0.5 units of
AmpErase uracil N-glycosylase (UNG), 200 nM each primer and
100 nM probe. The thermal cycling conditions comprised 2 min at
50°C and 10 min at 95°C, followed by 40 cycles at
95°C for 15 s and 65°C for 1 min. Each assay included: a standard
curve (from 10⁵ to 10² copies) in duplicate, a no-template control,
20 ng and 50 ng of calibrator human genomic DNA (Boehringer) in 
triplicate, and about 20 ng of unknown genomic DNA in triplicate 
(26 samples can thus be analyzed on a 96-well microplate). All 
samples with a coefficient of variation (CV) higher than 10% were 
retested. 
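
The retest rule for replicates can be sketched as below. The function names and the copy numbers are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say how the CV was computed.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def needs_retest(values: list[float], cv_limit: float = 10.0) -> bool:
    """Flag a replicate set whose CV is higher than the limit (10% in the text)."""
    return coefficient_of_variation(values) > cv_limit


# Hypothetical triplicate copy numbers for two samples.
print(needs_retest([6600.0, 6500.0, 6700.0]))  # tight replicates
print(needs_retest([6600.0, 5000.0, 8200.0]))  # scattered replicates
```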

All reactions were performed in the ABI Prism 7700 Sequence 
Detection System (Perkin-Elmer Applied Biosystems), which 
detects the signal from the fluorogenic probe during PCR. 

Equipment for real-time detection. The 7700 system has a
built-in thermal cycler and a laser directed via fiber optical cables
to each of the 96 sample wells. A charge-coupled-device (CCD)
camera collects the emission from each sample and the data are
analyzed automatically. The software accompanying the 7700
system calculates Ct and determines the starting copy number in the
samples. 






Determination of gene amplification. Gene amplification was 
calculated as described above. Only samples with an N value 
higher than 2 were considered to be amplified. 

RESULTS 

To validate the method, real-time PCR was performed on 
genomic DNA extracted from 108 primary breast tumors, and 18 
normal leukocyte DNA samples from some of the same patients. 
The target genes were the myc, ccnd1 and erbB2 proto-oncogenes,
and the β-amyloid precursor protein gene (app), which maps to a
chromosome region (21q21.2) in which no genetic alterations have
been found in breast tumors (Kallioniemi et al., 1994). The
reference disomic gene was the albumin gene (alb, chromosome
4q11-q13).



Validation of the standard curve and dynamic range 
of real-time PCR 

The standard curve was constructed from PCR products serially 
diluted in genomic mouse DNA at a constant concentration of 
2 ng/µl. It should be noted that the 5 primer pairs chosen to analyze
the 5 target genes do not amplify genomic mouse DNA (data not
shown). Figure 1 shows the real-time PCR standard curve for the
alb gene. The dynamic range was wide (at least 4 orders of
magnitude), with samples containing as few as 10² copies or as
many as 10⁵ copies.

Copy-number ratio of the 2 reference genes (app and alb)

The app to alb copy-number ratio was determined in 18 normal 
leukocyte DNA samples and all 108 primary breast-tumor DNA 

Figure 1 - Albumin (alb) gene dosage by real-time PCR. Top: Amplification plots for reactions with starting alb gene copy number ranging
from 10⁵ (A9), 10⁴ (A7), 10³ (A4) to 10² (A2) and a no-template control (A1). Cycle number is plotted vs. change in normalized reporter signal
(ΔRn). For each reaction tube, the fluorescence signal of the reporter dye (FAM) is divided by the fluorescence signal of the passive reference dye
(ROX), to obtain a ratio defined as the normalized reporter signal (Rn). ΔRn represents the normalized reporter signal (Rn) minus the baseline
signal established in the first 15 PCR cycles. ΔRn increases during PCR as alb PCR product copy number increases until the reaction reaches a
plateau. Ct (threshold cycle) represents the fractional cycle number at which a significant increase in Rn above a baseline signal (horizontal black
line) can first be detected. Two replicate plots were performed for each standard sample, but the data for only one are shown here. Bottom:
Standard curve plotting log starting copy number vs. Ct (threshold cycle). The black dots represent the data for standard samples plotted in
duplicate and the red dots the data for unknown genomic DNA samples plotted in triplicate. The standard curve shows 4 orders of linear dynamic
range.

samples. We selected these 2 genes because they are located in 2
chromosome regions (app, 21q21.2; alb, 4q11-q13) in which no
obvious genetic changes (including gains or losses) have been
observed in breast cancers (Kallioniemi et al., 1994). The ratio for
the 18 normal leukocyte DNA samples fell between 0.7 and 1.3 
(mean 1.02 ± 0.21), and was similar for the 108 primary breast- 
tumor DNA samples (0.6 to 1.6, mean 1.06 ± 0.25), confirming 
that alb and app are appropriate reference disomic genes for 
breast-tumor DNA. The low range of the ratios also confirmed that 
the nucleotide sequences chosen for the primers and probes were 
not polymorphic, as mismatches of their primers or probes with the 
subject's DNA would have resulted in differential amplification. 
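
The standard-curve quantification described for Figure 1 can be sketched as a least-squares fit of Ct against log10 starting copy number, inverted to estimate the copy number of an unknown sample. This is a minimal illustration only: the Ct values are invented (they are not data from this study), and the function names `fit_standard_curve` and `copies_from_ct` are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to get the starting copy number for a given Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards spanning 10^2 to 10^5 copies (made-up Ct values,
# spaced 3.3 cycles per decade purely for illustration).
standard_copies = [1e2, 1e3, 1e4, 1e5]
standard_ct = [33.0, 29.7, 26.4, 23.1]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
estimated = copies_from_ct(27.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies at Ct 27.0 = {estimated:.0f}")
```

Fitting on log10 copy number is what makes the curve linear over several orders of magnitude, so unknowns need not contain equal starting amounts of DNA.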

myc, ccnd1 and erbB2 gene dose in normal leukocyte DNA 

To determine the cut-off point for gene amplification in breast- 
cancer tissue, 18 normal leukocyte DNA samples were tested for 
the gene dose (N), calculated as described in "Material and 
Methods". The N value of these samples ranged from 0.5 to 1.3 
(mean 0.84 ± 0.22) for myc, 0.7 to 1.6 (mean 1.06 ± 0.23) for 
ccnd1 and 0.6 to 1.3 (mean 0.91 ± 0.19) for erbB2. Since N values 
for myc, ccnd1 and erbB2 in normal leukocyte DNA consistently 
fell between 0.5 and 1.6, values of 2 or more were considered to 
represent gene amplification in tumor DNA. 
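
The cut-off logic can be written as a small classifier. The sketch below uses the thresholds stated in this paper (N of 2 or more for amplification, N below 0.5 for a decreased copy number, and the N of 5 boundary used in Table I); the function name and the category labels are illustrative assumptions, not terms from the source.

```python
def classify_gene_dose(n):
    """Bin a gene dose N using the cut-offs quoted in the text and Table I."""
    if n < 0.5:
        return "decreased copy number"   # at least a 50% decrease
    if n < 2:
        return "not amplified"           # range seen in normal leukocyte DNA
    if n < 5:
        return "low-level amplification"
    return "high-level amplification"


# N values quoted in the text: normal means, and the reported maxima
for n in (0.84, 1.6, 2.0, 4.6, 18.6):
    print(n, classify_gene_dose(n))
```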

myc, ccnd1 and erbB2 gene dose in breast-tumor DNA 

myc, ccnd1 and erbB2 gene copy numbers in the 108 primary 
breast tumors are reported in Table I. Extra copies of ccnd1 were 
more frequent (23%, 25/108) than extra copies of erbB2 (15%, 
16/108) and myc (10%, 11/108), and ranged from 2 to 18.6 for 
ccnd1, 2 to 15.1 for erbB2, and only 2 to 4.6 for the myc gene. 
Figure 2 and Table II show tumors in which the ccnd1 gene was 
amplified 16-fold (T145), 6-fold (T133) and non-amplified (T118). 
The 3 genes were never all co-amplified in the same tumor. 
erbB2 and ccnd1 were co-amplified in only 3 cases, myc and ccnd1 
in 2 cases and myc and erbB2 in 1 case. This favors the hypothesis 
that gene amplifications are independent events in breast cancer. 
Interestingly, 5 tumors showed a decrease of at least 50% in the 
erbB2 copy number (N < 0.5), suggesting that they bore deletions 
of the 17q21 region (the site of erbB2). No such decrease in copy 
number was observed with the other 2 proto-oncogenes. 

Comparison of gene dose determined by real-time quantitative 
PCR and Southern-blot analysis 

Southern-blot analysis of myc, ccnd1 and erbB2 amplifications 
had previously been done on the same 108 primary breast tumors. A 
perfect correlation between the results of real-time PCR and 
Southern blot was obtained for tumors with high copy numbers 
(N ≥ 5). However, there were cases (1 myc, 6 ccnd1 and 4 erbB2) 
in which real-time PCR showed gene amplification whereas 
Southern blot did not, but these were mainly cases with low extra 
copy numbers (N from 2 to 2.9). 

DISCUSSION 

The clinical applications of gene amplification assays are 
currently limited, but would certainly increase if a simple, standard- 
ized and rapid method were perfected. Gene amplification status 
has been studied mainly by means of Southern blotting, but this 
method is not sensitive enough to detect low-level gene amplifica- 
tion nor accurate enough to quantify the full range of amplification 
values. Southern blotting is also time-consuming, uses radioactive 



TABLE I - DISTRIBUTION OF AMPLIFICATION LEVEL (N) FOR myc, 
ccnd1 AND erbB2 GENES IN 108 HUMAN BREAST TUMORS 

                      Amplification level (N) 
Gene      <0.5        0.5-1.9        2-4.9          ≥5 
myc       0           97 (89.8%)     11 (10.2%)     0 
ccnd1     0           83 (76.9%)     17 (15.7%)     8 (7.4%) 
erbB2     5 (4.6%)    87 (80.6%)     8 (7.4%)       8 (7.4%) 


reagents and requires relatively large amounts of high-quality 
genomic DNA, which means it cannot be used routinely in many 
laboratories. An amplification step is therefore required to deter- 
mine the copy number of a given target gene from minimal 
quantities of tumor DNA (small early-stage tumors, cytopuncture 
specimens or formalin-fixed, paraffin-embedded tissues). 

In this study, we validated a PCR method developed for the 
quantification of gene over-representation in tumors. The method, 
based on real-time analysis of PCR amplification, has several 
advantages over other PCR-based quantitative assays such as 
competitive quantitative PCR (Celi et al., 1994). First, the real-time 
PCR method is performed in a closed-tube system, avoiding the 
risk of contamination by amplified products. Re-amplification of 
carryover PCR products in subsequent experiments can also be 
prevented by using the enzyme uracil N-glycosylase (UNG) 
(Longo et al., 1990). The second advantage is the simplicity and 
rapidity of sample analysis, since no post-PCR manipulations are 
required. Our results show that the automated method is reliable. 
We found it possible to determine, in triplicate, the number of 
copies of a target gene in more than 100 tumors per day. Third, the 
system has a linear dynamic range of at least 4 orders of magnitude, 
meaning that samples do not have to contain equal starting amounts 
of DNA. This technique should therefore be suitable for analyzing 
formalin-fixed, paraffin-embedded tissues. Fourth, and above all, 
real-time PCR makes DNA quantification much more precise and 
reproducible, since it is based on Ct values rather than end-point 
measurement of the amount of accumulated PCR product. Indeed, 
the ABI Prism 7700 Sequence Detection System enables Ct to be 
calculated when PCR amplification is still in the exponential phase 
and when none of the reaction components is rate-limiting. The 
within-run CV of the Ct value for calibrator human DNA (5 
replicates) was always below 5%, and the between-assay precision 
in 5 different runs was always below 10% (data not shown). In 
addition, the use of a standard curve is not absolutely necessary, 
since the copy number can be determined simply by comparing the 
Ct ratio of the target gene with that of reference genes. The results 
obtained by the 2 methods (with and without a standard curve) are 
similar in our experiments (data not shown). Moreover, unlike 
competitive quantitative PCR, real-time PCR does not require an 
internal control (the design and storage of internal controls and the 
validation of their amplification efficiency is laborious). 
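
Quantification without a standard curve can be illustrated with the widely used comparative Ct (2^-ΔΔCt) calculation. This is a sketch under stated assumptions: the source does not give its exact formula, the calculation assumes that the target and reference assays amplify with equal, near-100% efficiency, and the Ct values and the function name are invented for illustration.

```python
def relative_gene_dose(ct_target_tumor, ct_ref_tumor,
                       ct_target_calibrator, ct_ref_calibrator):
    """Comparative Ct estimate of gene dose relative to a calibrator DNA.

    Assumes a doubling of product per cycle for both assays (an assumption,
    not a property reported in the source).
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2 ** -(delta_tumor - delta_calibrator)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier than
# the reference gene in the tumor, but at the same cycle in the calibrator.
n = relative_gene_dose(22.0, 25.0, 25.5, 25.5)
print(n)
```

Because each cycle corresponds to a two-fold change under this assumption, a 3-cycle shift corresponds to an 8-fold difference in gene dose.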

The only potential disadvantage of real-time PCR, like all other 
PCR-based methods and solid-matrix blotting techniques (South- 
ern blots and dot blots), is that it cannot avoid dilution artifacts 
inherent in the extraction of DNA from tumor cells contained in 
heterogeneous tissue specimens. Only FISH and immunohistochem- 
istry can measure alterations on a cell-by-cell basis (Pauletti et al., 
1996; Slamon et al., 1989). However, FISH requires expensive 
equipment and trained personnel and is also time-consuming. 
Moreover, FISH does not assess gene expression and therefore 
cannot detect cases in which the gene product is over-expressed in 
the absence of gene amplification, which will be possible in the 
future by real-time quantitative RT-PCR. Immunohistochemistry is 
subject to considerable variations in the hands of different teams, 
owing to alterations of target proteins during the procedure, the 
different primary antibodies and fixation methods used and the 
criteria used to define positive staining. 

The results of this study are in agreement with those reported in 
the literature. (i) Chromosome regions 4q11-q13 and 21q21.2 
(which bear alb and app, respectively) showed no genetic alter- 
ations in the breast-cancer samples studied here, in keeping with 
the results of CGH (Kallioniemi et al., 1994). (ii) We found that 
amplifications of these 3 oncogenes were independent events, as 
reported by other teams (Berns et al., 1992; Borg et al., 1992). (iii) 
The frequency and degree of myc amplification in our breast tumor 
DNA series were lower than those of ccnd1 and erbB2 amplifica- 
tion, confirming the findings of Borg et al. (1992) and Courjal et al. 
(1997). (iv) The maxima of ccnd1 and erbB2 over-representation 
were 18-fold and 15-fold, also in keeping with earlier results (about 
Figure 2 - ccnd1 and alb gene dosage by real-time PCR in 3 breast tumor samples: T118 (E12, C6, black squares), T133 (G11, B4, red squares) 
and T145 (A8, C8, blue squares). Given the Ct of each sample, the initial copy number is inferred from the standard curve obtained during the same 
experiment. Triplicate plots were performed for each tumor sample, but the data for only one are shown here. The results are shown in Table II. 



30-fold maximum) (Berns et al., 1992; Borg et al., 1992; Courjal et 
al., 1997). (v) The erbB2 copy numbers obtained with real-time 
PCR were in good agreement with data obtained with other 
quantitative PCR-based assays in terms of the frequency and 
degree of amplification (An et al., 1995; Deng et al., 1996; Valeron 
et al., 1996). Our results also correlate well with those recently 
published by Gelmini et al. (1997), who used the TaqMan system to 
measure erbB2 amplification in a small series of breast tumors 
(n = 25), but with an instrument (LS-50B luminescence spectrom- 
eter, Perkin-Elmer Applied Biosystems) which only allows end- 
TABLE II - EXAMPLES OF ccnd1 GENE DOSAGE RESULTS 
FROM 3 BREAST TUMORS(a) 

                  ccnd1                              alb 
Tumor    Copy number    Mean      SD      Copy number    Mean     SD     Nccnd1/alb 
T118     4525                             4223 
         4605           4603      77      4365           4325     89     1.06 
         4678                             4387 
T133     59821                            9787 
         61659          61100     1111    10092          10137    375    6.03 
         61821                            10533 
T145     128563                           7321 
         125892         125392    3448    7762           7672     316    16.34 
         121722                           7933 

(a) For each sample, 3 replicate experiments were performed and the mean 
and the standard deviation (SD) were determined. The level of ccnd1 gene 
amplification (Nccnd1/alb) is determined by dividing the average ccnd1 
copy number value by the average alb copy number value. 
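
The calculation described in the Table II footnote can be checked directly from the replicate copy numbers. The sketch below uses the values printed in Table II; the variable and function names are illustrative, and the use of the sample standard deviation is an assumption about how SD was computed.

```python
from statistics import mean, stdev

# Triplicate copy numbers as printed in Table II
replicates = {
    "T118": {"ccnd1": [4525, 4605, 4678], "alb": [4223, 4365, 4387]},
    "T133": {"ccnd1": [59821, 61659, 61821], "alb": [9787, 10092, 10533]},
    "T145": {"ccnd1": [128563, 125892, 121722], "alb": [7321, 7762, 7933]},
}


def gene_dose(sample):
    """N = average ccnd1 copy number / average alb copy number."""
    return mean(sample["ccnd1"]) / mean(sample["alb"])


results = {tumor: round(gene_dose(s), 2) for tumor, s in replicates.items()}

for tumor, sample in replicates.items():
    print(tumor,
          round(mean(sample["ccnd1"])), round(stdev(sample["ccnd1"])),
          round(mean(sample["alb"])), round(stdev(sample["alb"])),
          results[tumor])
```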



point measurement of fluorescence intensity. Here we report myc 
and ccnd1 gene dosage in breast cancer by means of quantitative 
PCR. (vi) We found a high degree of concordance between 
real-time quantitative PCR and Southern blot analysis in terms of 
gene amplification, especially for samples with high copy numbers 
(> 5-fold). The slightly higher frequency of gene amplification 
(especially ccnd1 and erbB2) observed by means of real-time 
quantitative PCR as compared with Southern-blot analysis may be 
explained by the higher sensitivity of the former method. However, 
we cannot rule out the possibility that some tumors with a few extra 



gene copies observed in real-time PCR had additional copies of an 
arm or a whole chromosome (trisomy, tetrasomy or polysomy) 
rather than true gene amplification. These 2 types of genetic 
alteration (polysomy and gene amplification) could be easily 
distinguished in the future by using an additional probe located on 
the same chromosome arm, but some distance from the target gene. 
It is noteworthy that high gene copy numbers have the greatest 
prognostic significance in breast carcinoma (Borg et al., 1992; 
Slamon et al., 1987). 

Finally, this technique can be applied to the detection of gene 
deletion as well as gene amplification. Indeed, we found a 
decreased copy number of erbB2 (but not of the other 2 proto- 
oncogenes) in several tumors; erbB2 is located in a chromosome 
region (17q21) reported to contain both deletions and amplifica- 
tions in breast cancer (Bieche and Lidereau, 1995). 

In conclusion, gene amplification in various cancers can be used 
as a marker of pre-neoplasia and for early diagnosis of cancer, 
staging, prognostication and choice of treatment. Southern blotting 
is not sufficiently sensitive, and FISH is lengthy and complex. 
Real-time quantitative PCR overcomes both these limitations, and 
is a sensitive and accurate method of analyzing large numbers of 
samples in a short time. It should find a place in routine clinical 
gene dosage. 
Genome-wide Study of Gene Copy Numbers, 
Transcripts, and Protein Levels in Pairs of 
Non-invasive and Invasive Human Transitional 
Cell Carcinomas* 

Torben F. Ørntoft, Thomas Thykjaer, Frederic M. Waldman, Hans Wolf, 
and Julio E. Celis 



Gain and loss of chromosomal material is characteristic 
of bladder cancer, as well as malignant transformation in 
general. The consequences of these changes at both the 
transcription and translation levels are at present unknown, 
partly because of technical limitations. Here we have at- 
tempted to address this question in pairs of non-invasive 
and invasive human bladder tumors using a combination 
of technology that included comparative genomic hybrid- 
ization, high density oligonucleotide array-based monitor- 
ing of transcript levels (5600 genes), and high resolution 
two-dimensional gel electrophoresis. The results showed 
that there is a gene dosage effect that in some cases 
superimposes on other regulatory mechanisms. This ef- 
fect depended (p < 0.015) on the magnitude of the com- 
parative genomic hybridization change. In general (18 of 
23 cases), chromosomal areas with more than 2-fold gain 
of DNA showed a corresponding increase in mRNA tran- 
scripts. Areas with loss of DNA, on the other hand, 
showed either reduced or unaltered transcript levels. Be- 
cause most proteins resolved by two-dimensional gels 
are unknown it was only possible to compare mRNA and 
protein alterations in relatively few cases of well focused 
abundant proteins. With few exceptions we found a good 
correlation (p < 0.005) between transcript alterations and 
protein levels. The implications, as well as limitations, 
of the approach are discussed. Molecular & Cellular 
Proteomics 1:37-45, 2002. 



Aneuploidy is a common feature of most human cancers 
(1), but little is known about the genome-wide effect of this 
phenomenon at both the transcription and translation levels. 
High throughput array studies of the breast cancer cell line 
BT474 have suggested that there is a correlation between 
DNA copy numbers and gene expression in highly amplified 
areas (2), and studies of individual genes in solid tumors 
have revealed a good correlation between gene dose and 
mRNA or protein levels in the case of c-erb-B2, cyclin D1, 
ems1, and N-myc (3-5). However, a high cyclin D1 protein 
expression has been observed without simultaneous am- 
plification (4), and a low level of c-myc copy number in- 
crease was observed without concomitant c-myc protein 
overexpression (6). 

In human bladder tumors, karyotyping, fluorescent in situ 
hybridization, and comparative genomic hybridization (CGH) 1 
have revealed chromosomal aberrations that seem to be 
characteristic of certain stages of disease progression. In the 
case of non-invasive pTa transitional cell carcinomas (TCCs), 
this includes loss of chromosome 9 or parts of it, as well as 
loss of Y in males. In minimally invasive pT1 TCCs, the fol- 
lowing alterations have been reported: 2q-, 11p-, 1q+, 
11q13+, 17q+, and 20q+ (7-12). It has been suggested that 
these regions harbor tumor suppressor genes and onco- 
genes; however, the large chromosomal areas involved often 
contain many genes, making meaningful predictions of the 
functional consequences of losses and gains very difficult. 

In this investigation we have combined genome-wide tech- 
nology for detecting genomic gains and losses (CGH) with 
gene expression profiling techniques (microarrays and pro- 
teomics) to determine the effect of gene copy number on 
transcript and protein levels in pairs of non-invasive and in- 
vasive human bladder TCCs. 

EXPERIMENTAL PROCEDURES 
Material- Bladder tumor biopsies were sampled after informed 
consent was obtained and after removal of tissue for routine pathol- 
ogy examination. By light microscopy tumors 335 and 532 were 
staged by an experienced pathologist as pTa (superficial papillary), 

1 The abbreviations used are: CGH, comparative genomic hybrid- 
ization; TCC, transitional cell carcinoma; LOH, loss of heterozygosity; 
PA-FABP, psoriasis-associated fatty acid-binding protein; 2D, 
two-dimensional. 
Fig. 1. DNA copy number and mRNA expression level. Shown from left to right are chromosome (Chr.), CGH profiles, gene location and 
expression level of specific genes, and overall expression level along the chromosome. A, expression of mRNA in invasive tumor 733 as 
compared with the non-invasive counterpart tumor 335. B, expression of mRNA in invasive tumor 827 compared with the non-invasive 
counterpart tumor 532. The average fluorescent signal ratio between tumor DNA and normal DNA is shown along the length of the chromosome 
(left). The bold curve in the ratio profile represents a mean of four chromosomes and is surrounded by thin curves indicating one standard 
deviation. The central vertical line (broken) indicates a ratio value of 1 (no change), and the vertical lines next to it (dotted) indicate a ratio of 
0.5 (left) and 2.0 (right). In chromosomes where the non-invasive tumor 335 used for comparison showed alterations in DNA content, the ratio 
profile of that chromosome is shown to the right of the invasive tumor profile. The colored bars each represent one gene, identified by the 
running numbers above the bars (the name of the gene can be seen at www.MDL.DK/sdata.html). The bars indicate the purported location of 
the gene, and the colors indicate the expression level of the gene in the invasive tumor compared with the non-invasive counterpart: >2-fold 
increase (black), >2-fold decrease (blue), no significant change (orange). The bar to the far right, entitled Expression, shows the resulting change 
in expression along the chromosome; the colors indicate that at least half of the genes were up-regulated (black), at least half of the genes 
down-regulated (blue), or more than half of the genes are unchanged (orange). If a gene was absent in one of the samples and present in 
another, it was regarded as more than a 2-fold change. A 2-fold level was chosen as this corresponded to one standard deviation in a double 
determination of ~1800 genes. Centromeres and heterochromatic regions were excluded from data analysis. 



grade I and II, respectively. Tumors 733 and 827 were staged as pT1 
(invasive into submucosa); 733 was classified as solid and 827 as 
papillary, both grade III. 

mRNA Preparation— Tissue biopsies, obtained fresh from surgery, 
were embedded immediately in a sodium-guanidinium thiocyanate 
solution and stored at -80 °C. Total RNA was isolated using the 
RNAzol B RNA isolation method (WAK-Chemie Medical GMBH). 
poly(A) + RNA was isolated by an oligo(dT) selection step (Oligotex 
mRNA kit; Qiagen). 

cRNA Preparation— 1 µg of mRNA was used as starting material. 
The first and second strand cDNA synthesis was performed using the 
Superscript® choice system (Invitrogen) according to the manufac- 
turer's instructions but using an oligo(dT) primer containing a T7 RNA 
polymerase binding site. Labeled cRNA was prepared using the ME- 
GAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and 



UTP (Enzo) were used, together with unlabeled NTPs in the reaction. 
Following the in vitro transcription reaction, the unincorporated nu- 
cleotides were removed using RNeasy columns (Qiagen). 

Array Hybridization and Scanning— Array hybridization and scan- 
ning was modified from a previous method (13). 10 µg of cRNA was 
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris 
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization, 
the fragmented cRNA in a 6x SSPE-T hybridization buffer (1 M NaCl, 
10 mM Tris, pH 7.6, 0.005% Triton), was heated to 95 °C for 5 min, 
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe 
array cartridge. The probe array was then incubated for 16 h at 40 °C 
at constant rotation (60 rpm). The probe array was exposed to 10 
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T 
at 50 °C. The biotinylated cRNA was stained with a streptavidin- 
phycoerythrin conjugate, 10 µg/ml (Molecular Probes) in 6x SSPE-T 
for 30 min at 25 °C followed by 10 washes in 6x SSPE-T at 25 °C. The 
probe arrays were scanned at 560 nm using a confocal laser scanning 
microscope (made for Affymetrix by Hewlett-Packard). The readings 
from the quantitative scanning were analyzed by Affymetrix gene 
expression analysis software. 

Microsatellite Analysis- Microsatellite analysis was performed as 
described previously (14). Microsatellites were selected by use of 
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were ob- 
tained from the genome data base at www.gdb.org. DNA was extracted 
from tumor and blood and amplified by PCR in a volume of 20 µl for 35 
cycles. The amplicons were denatured and electrophoresed for 3 h in an 
ABI Prism 377. Data were collected in the Gene Scan program for 
fragment analysis. Loss of heterozygosity was defined as less than 33% 
of one allele detected in tumor amplicons compared with blood. 
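
One way to read the 33% criterion is as a threshold on the allele peak ratio in tumor normalized to the same ratio in blood. The sketch below is an interpretation, not the authors' stated procedure: the normalization, the function names, and the peak values are all assumptions made for illustration, and both alleles are assumed to give non-zero peaks.

```python
def allele_retention(normal_peaks, tumor_peaks):
    """Fraction of the reduced allele remaining in tumor relative to blood.

    Each argument is a pair of peak measurements (allele 1, allele 2).
    The tumor allele ratio is normalized to the blood allele ratio, and the
    result is folded so that it is always <= 1 whichever allele is reduced.
    """
    n1, n2 = normal_peaks
    t1, t2 = tumor_peaks
    ratio = (t1 / t2) / (n1 / n2)
    return min(ratio, 1 / ratio)


def has_loh(normal_peaks, tumor_peaks, threshold=0.33):
    """Call LOH when less than 33% of one allele remains in the tumor."""
    return allele_retention(normal_peaks, tumor_peaks) < threshold


# Hypothetical electropherogram peak heights (blood, then tumor)
print(has_loh((1000, 900), (1000, 200)))  # second allele strongly reduced
print(has_loh((1000, 900), (800, 700)))   # both alleles retained
```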

Proteomic Analysis— TCCs were minced into small pieces and 
homogenized in a small glass homogenizer in 0.5 ml of lysis solution. 
Samples were stored at -20 °C until use. The procedure for 2D gel 
electrophoresis has been described in detail elsewhere (15, 16). Gels 
were stained with silver nitrate and/or Coomassie Brilliant Blue. Pro- 
teins were identified by a combination of procedures that included 
microsequencing, mass spectrometry, two-dimensional gel Western 
immunoblotting, and comparison with the master two-dimensional gel 
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis. 

CGH- Hybridization of differentially labeled tumor and normal DNA 
to normal metaphase chromosomes was performed as described 
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas Red- 



labeled reference DNA (200 ng), and human Cot-1 DNA (20 µg) were 
denatured at 37 °C for 5 min and applied to denatured normal met- 
aphase slides. Hybridization was at 37 °C for 2 days. After washing, 
the slides were counterstained with 0.15 µg/ml 4,6-diamidino-2-phe- 
nylindole in an anti-fade solution. A second hybridization was per- 
formed for all tumor samples using fluorescein-labeled reference DNA 
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the 
aberrations detected during the initial hybridization. Each CGH ex- 
periment also included a normal control hybridization using fluores- 
cein- and Texas Red-labeled normal DNA. Digital image analysis was 
used to identify chromosomal regions with abnormal fluorescence 
ratios, indicating regions of DNA gains and losses. The average 
green:red fluorescence intensity ratio profiles were calculated using 
four images of each chromosome (eight chromosomes total) with 
normalization of the green:red fluorescence intensity ratio for the 
entire metaphase and background correction. Chromosome identifi- 
cation was performed based on 4,6-diamidino-2-phenylindole band- 
ing patterns. Only images showing uniform high intensity fluores- 
cence with minimal background staining were analyzed. All 
centromeres, p arms of acrocentric chromosomes, and heterochro- 
matic regions were excluded from the analysis. 
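
The averaging of ratio profiles can be sketched as a position-wise mean and standard deviation across the individual chromosome profiles. The gain/loss rule used here (mean plus or minus one standard deviation lying entirely on one side of a ratio of 1) follows the inclusion criterion given in the legend to Fig. 2; the data layout, the function name, and the ratio values are hypothetical, and the normalization and background correction steps are not modeled.

```python
from statistics import mean, stdev


def call_cgh_positions(profiles):
    """Classify each position along a chromosome from replicate ratio profiles.

    profiles: list of equal-length lists of tumor:reference fluorescence
    ratios, one list per analyzed chromosome image.
    """
    calls = []
    for ratios in zip(*profiles):
        m = mean(ratios)
        s = stdev(ratios)
        if m - s > 1.0:
            calls.append("gain")
        elif m + s < 1.0:
            calls.append("loss")
        else:
            calls.append("no change")
    return calls


# Hypothetical ratios at three positions, from four profiles
profiles = [
    [1.0, 2.1, 0.6],
    [1.1, 1.9, 0.5],
    [0.9, 2.0, 0.7],
    [1.0, 2.2, 0.6],
]
print(call_cgh_positions(profiles))
```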

RESULTS 

Comparative Genomic Hybridization— The CGH analysis 
identified a number of chromosomal gains and losses in the 
Table I 

Correlation between alterations detected by CGH and by expression monitoring 

Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as 
independent variable (if expression alteration - what CGH deviation was found). 

Top 

Tumor 733 vs. 335 
CGH alterations    Expression change clusters    Concordance 
13 Gain            10 Up-regulation              77% 
                    0 Down-regulation 
                    3 No change 
   Loss             1 Up-regulation              50% 
                    5 Down-regulation 

Tumor 827 vs. 532 
CGH alterations    Expression change clusters    Concordance 
10 Gain             8 Up-regulation              80% 
                    0 Down-regulation 
12 Loss             3 Up-regulation              17% 
                    7 No change 

Bottom 

Tumor 733 vs. 335 
Expression change clusters    CGH alterations    Concordance 
16 Up-regulation              11 Gain            69% 
                               2 Loss 
                               3 No change 
21 Down-regulation             1 Gain            38% 
                               8 Loss 
                              12 No change 
15 No change                   3 Gain            60% 
                               3 Loss 
                               9 No change 

Tumor 827 vs. 532 
Expression change clusters    CGH alterations    Concordance 
17 Up-regulation              10 Gain            59% 
                               5 Loss 
                               2 No change 
 9 Down-regulation             0 Gain            33% 
                               3 Loss 
                               6 No change 
21 No change                   1 Gain            81% 
                               3 Loss 
                              17 No change 


two invasive tumors (stage pT1, TCCs 733 and 827), whereas 
the two non-invasive papillomas (stage pTa, TCCs 335 and 
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-, 
and Y-, respectively. Both invasive tumors showed changes 
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-, 
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are typ- 
ical for their disease stage, as well as additional alterations, 
some of which are shown in Fig. 1. Areas with gains and 
losses deviated from the normal copy number to some extent, 
and the average numerical deviation from normal was 0.4-fold 
in the case of TCC 733 and 0.3-fold for TCC 827. The largest 
changes, amounting to at least a doubling of chromosomal 
content, were observed at 1q23 in TCC 733 (Fig. 1A) and 
20q12 in TCC 827 (Fig. 1B). 

mRNA Expression in Relation to DNA Copy Number— The 
mRNA levels from the two invasive tumors (TCCs 827 and 
733) were compared with the two non-invasive counterparts 
(TCCs 532 and 335). This was done in two separate experi- 
ments in which we compared TCCs 733 to 335 and 827 to 
532, respectively, using two different scaling settings for the 
arrays to rule out scaling as a confounding parameter. Ap- 
proximately 1,800 genes that yielded a signal on the arrays 
were searched in the Unigene and Genemap data bases for 
chromosomal location, and those with a known location 
(1096) were plotted as bars covering their purported locus. In 
that way it was possible to construct a graphic presentation of 
DNA copy number and relative mRNA levels along the indi- 
vidual chromosomes (Fig. 1). 

For each mRNA a ratio was calculated between the level in 
the invasive versus the non-invasive counterpart. Bars, which 
represent chromosomal location of a gene, were color-coded 
according to the expression ratio, and only differences larger 



than 2-fold were regarded as informative (Fig. 1). The density 
of genes along the chromosomes varied, and areas contain- 
ing only one gene were excluded from the calculations. The 
resolution of the CGH method is very low, and some of the 
outlier data may be because of the fact that the boundaries of 
the chromosomal aberrations are not known at high resolution. 

Two sets of calculations were made from the data. For the 
first set we used CGH alterations as the independent variable 
and estimated the frequency of expression alterations in these 
chromosomal areas. In general, areas with a strong gain of 
chromosomal material contained a cluster of genes having 
increased mRNA expression. For example, both chromo- 
somes 1q21-q25, 2p and 9q, showed a relative gain of more 
than 100% in DNA copy number that was accompanied by 
increased mRNA expression levels in the two tumor pairs (Fig. 
1). In most cases, chromosomal gains detected by CGH were 
accompanied by an increased level of transcripts in both 
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal 
losses, on the other hand, were not accompanied by de- 
creased expression in several cases, and were often regis- 
tered as having unaltered RNA levels (Table I, top). The inabil- 
ity to detect RNA expression changes in these cases was not 
because of fewer genes mapping to the lost regions (data not 
shown). 

In the second set of calculations we selected expression 
alterations above 2-fold as the independent variable and es- 
timated the frequency of CGH alterations in these areas. As 
above, we found that increased transcript expression corre- 
lated with gain of chromosomal material (TCC 733, 69% and 
TCC 827, 59%), whereas reduced expression was often de- 
tected in areas with unaltered CGH ratios (Table I, bottom). 
Furthermore, as a control we looked at areas with no alter- 
ation in expression. No alteration was detected by CGH in 
most of these areas (TCC 733, 60% and TCC 827, 81%; see 
Table I, bottom). Because the ability to observe reduced or 
increased mRNA expression clustering to a certain chromo- 
somal area clearly reflected the extent of copy number 
changes, we plotted the maximum CGH aberrations in the 
regions showing CGH changes against the ability to detect a 
change in mRNA expression as monitored by the oligonucleo- 
tide arrays (Fig. 2). For both tumors TCC 733 (p < 0.015) and 
TCC 827 (p < 0.00003) a highly significant correlation was 
observed between the level of CGH ratio change (reflecting 
the DNA copy number) and alterations detected by the array 
based technology (Fig. 2). Similar data were obtained when 
areas with altered expression were used as independent vari- 
ables. These areas correlated best with CGH when the CGH 
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did 
not at lower CGH deviations. These data probably reflect that 
loss of an allele may only lead to a 50% reduction in expres- 
sion level, which is at the cut-off point for detection of expres- 
sion alterations. Gain of chromosomal material can occur to a 
much larger extent. 

Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array 
monitoring. The aberration is shown as a numerical -fold change in ratio between invasive tumors 827 (A) and 733 (♦) and their non-invasive 
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting 
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated 
to be scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the 
ratio value of one were included. 
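
The concordance percentages in Table I (bottom) can be recomputed from the counts: for each class of expression change, concordance is the share of clusters whose CGH result points the same way (up-regulation with gain, down-regulation with loss, no change with no change). The counts below are those printed in Table I; the dictionary layout and names are illustrative.

```python
# Counts from Table I (bottom): expression change -> CGH result -> clusters
table_bottom = {
    "733 vs. 335": {
        "up": {"gain": 11, "loss": 2, "no change": 3},
        "down": {"gain": 1, "loss": 8, "no change": 12},
        "no change": {"gain": 3, "loss": 3, "no change": 9},
    },
    "827 vs. 532": {
        "up": {"gain": 10, "loss": 5, "no change": 2},
        "down": {"gain": 0, "loss": 3, "no change": 6},
        "no change": {"gain": 1, "loss": 3, "no change": 17},
    },
}

# CGH result regarded as concordant with each class of expression change
MATCHING = {"up": "gain", "down": "loss", "no change": "no change"}


def concordance(counts, expression_change):
    """Percentage of clusters whose CGH result matches the expression change."""
    return 100 * counts[MATCHING[expression_change]] / sum(counts.values())


percentages = {
    pair: {change: round(concordance(counts, change))
           for change, counts in rows.items()}
    for pair, rows in table_bottom.items()
}
print(percentages)
```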

Microsatellite-based Detection of Minor Areas of Losses- 
In TCC 733, several chromosomal areas exhibiting DNA 
amplification were preceded or followed by areas with a nor- 
mal CGH but reduced mRNA expression (see Fig. 1, TCC 733 
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and 
10q22). To determine whether these results were because of 
undetected loss of chromosomal material in these regions or 



because of other non-structural mechanisms regulating tran- 
scription, we examined two microsatellites positioned at chro- 
mosome 1q25-32 and two at chromosome 2p22. Loss of 
heterozygosity (LOH) was found at both 1q25 and at 2p22 
indicating that minor deleted areas were not detected with the 
resolution of CGH (Fig. 3). Additionally, chromosome 2p in 
TCC 733 showed a CGH pattern of gain/no change/gain of 
DNA that correlated with transcript increase/decrease/in- 
crease. Thus, for the areas showing increased expression 
there was a correlation with the DNA copy number alterations 
(Fig. 1A). As indicated above, the mRNA decrease observed in 
the middle of the chromosomal gain was because of LOH, 
implying that one of the mechanisms for mRNA down-regu- 
lation may be regions that have undergone smaller losses of 
chromosomal material. However, this cannot be detected with 
the resolution of the CGH method. 

In both TCC 733 and TCC 827, the telomeric end of chro- 
mosome 11p showed a normal ratio in the CGH analysis; 
however, clusters of five and three genes, respectively, lost 
their expression. Two microsatellites (D11S1760, D11S922) 
positioned close to MUC2, IGF2, and cathepsin D indicated 
LOH as the most likely mechanism behind the loss of expres- 
sion (data not shown). 

A reduced expression of mRNA observed in TCC 733 at 
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24 
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2, 
and 18q12 was also examined for chromosomal losses using 
microsatellites positioned as close as possible to the gene loci 

Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor 
733 showing loss of heterozygosity at chromosome 1q25, detected 
(a) by D1S215 close to Hu class I histocompatibility antigen (gene 
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene 
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close 
to general β-spectrin (gene number 11 in Fig. 1) and of (d) tumor 827 
showing loss of heterozygosity at chromosome 18q12 by S18S1118 
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number 
12 in Fig. 1). The upper curves show the electropherogram obtained 
from normal DNA from leukocytes (N), and the lower curves show the 
electropherogram from tumor DNA (T). In all cases one allele is 
partially lost in the tumor amplicon. 

showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that 
transcriptional down-regulation of genes in the other regions 
may be controlled by other mechanisms. 

Relation between Changes in mRNA and Protein Levels— 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors 
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 

Fig. 4. Correlation between protein levels as judged by 2D-PAGE 
and transcript ratio. For comparison proteins were divided in 
three groups, unaltered in level or up- or down-regulated (horizontal 
axis). The mRNA ratio as determined by oligonucleotide arrays was 
plotted for each gene (vertical axis). ▲, mRNAs that were scored as 
present in both tumors used for the ratio calculation; △, mRNAs that 
were scored as absent in the invasive tumors (along horizontal axis) or 
as absent in non-invasive reference (top of figure). Two different 
scalings were used to exclude scaling as a confounder: TCCs 827 
and 532 (▲△) were scaled with background suppression, and TCCs 
733 and 335 (●○) were scaled without suppression. Both comparisons 
showed highly significant (p < 0.005) differences in mRNA ratios 
between the groups. Proteins shown were as follows: Group A (from 
left), phosphoglucomutase 1, glutathione transferase class μ number 
4, fatty acid-binding protein homologue, cytokeratin 15, and 
cytokeratin 13; B (from left), fatty acid-binding protein homologue, 28-kDa 
heat shock protein, cytokeratin 13, and calcyclin; C (from left), 
α-enolase, hnRNP B1, 28-kDa heat shock protein, 14-3-3-c, and 
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from 
top), glutathione S-transferase-π and mesothelial keratin K7 (type II); 
F (from top and left), adenylyl cyclase-associated protein, E-cadherin, 
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV, 
cytoskeletal γ-actin, hnRNP A1, integral membrane protein calnexin 
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa 
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B, 
translationally controlled tumor protein, liver 
glyceraldehyde-3-phosphate dehydrogenase, keratin 8, aldehyde reductase, and 
Na,K-ATPase β-1 subunit; G (from top and left), TCP20, calgizzarin, 
70-kDa heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP 
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver 
glyceraldehyde-3-phosphate dehydrogenase, glutathione 
S-transferase-π, and keratin 8; H (from left), plasma gelsolin, autoantigen 
calreticulin, thioredoxin, and NAD+-dependent 15 hydroxyprostaglandin 
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit, 
cytokeratin 20, cytokeratin 17, prohibitin, and fructose 
1,6-biphosphatase; J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat 
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78, 
cyclophilin, and cofilin. 

gradient, and having a known chromosomal location, were 
selected for analysis in the TCC pair 827/532. Proteins were 
identified by a combination of methods (see "Experimental 
Procedures"). In general there was a highly significant corre- 
lation (p < 0.005) between mRNA and protein alterations (Fig. 
4). Only one gene showed disagreement between transcript 
alteration and protein alteration. Except for a group of cyto- 

Fig. 5. Comparison of protein and transcript levels in invasive 
and non-invasive TCCs. The upper part of the figure shows a 2D gel 
(left) and the oligonucleotide array (right) of TCC 532. The red rectan- 
gles on the upper gel highlight the areas that are compared below. 
Identical areas of 2D gels of TCCs 532 and 827 are shown below. 
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC 
827 (red annotation). The tile on the array containing probes for 
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532 
and is compared with TCC 827. The upper row of squares in each tile 
corresponds to perfect match probes; the lower row corresponds to 
mismatch probes containing a mutation (used for correction for un- 
specific binding). Absence of signal is depicted as black, and the 
higher the signal the lighter the color. A high transcript level was 
detected in TCC 532 (6151 units) whereas a much lower level was 
detected in TCC 827 (absence of signals). For cytokeratin 13, a high 
transcript level was also present in TCC 532 (15659 units), and a 
much lower level was present in TCC 827 (623 units). The 2D gels at 
the bottom of the figure (left) show levels of PA-FABP and adipocyte- 
FABP in TCCs 335 and 733 (invasive), respectively. Both proteins are 
down-regulated in the invasive tumor. To the right we show the array 
tiles for the PA-FABP transcript. A medium transcript level was de- 
tected in the case of TCC 335 (1277 units) whereas very low levels 
were detected in TCC 733 (166 units). IEF, isoelectric focusing. 



keratins encoded by genes on chromosome 17 (Fig. 5) the 
analyzed proteins did not belong to a particular family. 26 well 
focused proteins whose genes had a known chromosomal 
location were detected in TCCs 733 and 335, and of these 19 
correlated (p < 0.005) with the mRNA changes detected using 
the arrays (Fig. 4). For example, PA-FABP was highly ex- 
pressed in the non-invasive TCC 335 but lost in the invasive 
counterpart (TCC 733; see Fig. 5). The smaller number of 
proteins detected in both 733 and 335 was because of the 
smaller size of the biopsies that were available. 

Eleven chromosomal regions where CGH showed aberrations 
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are 
found to be frequently altered in bladder cancer, namely 
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these pro- 
teins were encoded by genes in chromosome 17q, a fre- 
quently amplified chromosomal area in invasive bladder 
cancers. 

DISCUSSION 

Most human cancers have abnormal DNA content, having 
lost some chromosomal parts and gained others. The present 
study provides some evidence as to the effect of these gains 
and losses on gene expression in two pairs of non-invasive 
and invasive TCCs using high throughput expression arrays 
and proteomics, in combination with CGH. In general, the 
results showed that there is a clear individual regulation of the 
mRNA expression of single genes, which in some cases was 
superimposed by a DNA copy number effect. In most cases, 
genes located in chromosomal areas with gains often exhib- 
ited increased mRNA expression, whereas areas showing 
losses showed either no change or a reduced mRNA expres- 
sion. The latter might be because of the fact that losses most 
often are restricted to loss of one allele, and the cut-off point 
for detection of expression alterations was a 2-fold change, 
thus being at the border of detection. In several cases, however, 

Table II 
Proteins whose expression level correlates with both mRNA and gene dose changes 

Protein                Chromosomal location  Tumor TCC  CGH alteration  Transcript alteration  Protein alteration 
Annexin II             1q21                  733        Gain            Abs to Pres(a)         Increase 
Annexin IV             2p13                  733        Gain            3.9-Fold up            Increase 
Cytokeratin 17         17q12-q21             827        Gain            3.8-Fold up            Increase 
Cytokeratin 20         17q21.1               827        Gain            5.6-Fold up            Increase 
(PA-)FABP              8q21.2                827        Loss            10-Fold down           Decrease 
FBP1                   9q22                  827        Gain            2.3-Fold up            Increase 
Plasma gelsolin        9q31                  827        Gain            Abs to Pres            Increase 
Heat shock protein 28  15q12-q13             827        Loss            2.5-Fold up            Decrease 
Prohibitin             17q21                 827/733    Gain            3.7-/2.5-Fold up(b)    Increase 
Prolyl-4-hydroxyl      17q25                 827/733    Gain            5.7-/1.6-Fold up       Increase 
hnRNP B1               7p15                  827        Loss            2.5-Fold down          Decrease 

(a) Abs, absent; Pres, present. 
(b) In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733. 

an increase or decrease in DNA copy number was 
associated with de novo occurrence or complete loss of tran- 
script, respectively. Some of these transcripts could not be 
detected in the non-invasive tumor but were present at rela- 
tively high levels in areas with DNA amplifications in the inva- 
sive tumors (e.g. in TCC 733 transcript from cellular ligand of 
annexin II gene (chromosome 1q21) from absent to 2670 
arbitrary units; in TCC 827 transcript from small proline-rich 
protein 1 gene (chromosome 1q12-q21.1) from absent to 
1326 arbitrary units). It may be anticipated from these data 
that significant clustering of genes with an increased expres- 
sion to a certain chromosomal area indicates an increased 
likelihood of gain of chromosomal material in this area. 

Considering the many possible regulatory mechanisms act- 
ing at the level of transcription, it seems striking that the gene 
dose effects were so clearly detectable in gained areas. One 
hypothetical explanation may lie in the loss of controlled 
methylation in tumor cells (17-19). Thus, it may be possible 
that in chromosomes with increased DNA copy numbers two 
or more alleles could be demethylated simultaneously leading 
to a higher transcription level, whereas in chromosomes with 
losses the remaining allele could be partly methylated, turning 
off the process (20, 21). A recent report has documented a 
ploidy regulation of gene expression in yeast, but in this case all 
the genes were present in the same ratio (22), a situation that is 
not analogous to that of cancer cells, which show marked 
chromosomal aberrations, as well as gene dosage effects. 

Several CGH studies of bladder cancer have shown that 
some chromosomal aberrations are common at certain 
stages of disease progression, often occurring in more than 1 
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y- 
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+, 
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here 
showed similar aberrations such as 9p- and 9q22-q33- and 
9q- and Y-, respectively. Likewise, the two minimal invasive 
pT1 tumors showed aberrations that are commonly seen at 
that stage, and TCC 827 had a remarkable resemblance to the 
commonly seen pattern of losses and gains, such as 1q22-24 
amplification (seen in both tumors), 11q14-q22 loss, the latter 
often linked to 17q+ (both tumors), and 1q+ and 9p-, often 
linked to 20q+ and 11q13+ (both tumors) (7-9). These observations 
indicate that the pairs of tumors used in this study 
exhibit chromosomal changes observed in many tumors, and 
therefore the findings could be of general importance for 
bladder cancer. 

Considering that the mapping resolution of CGH is of about 
20 megabases it is only possible to get a crude picture of 
chromosomal instability using this technique. Occasionally, 
we observed reduced transcript levels close to or inside re- 
gions with increased copy numbers. Analysis of these regions 
by positioning heterozygous microsatellites as close as pos- 
sible to the locus showing reduced gene expression revealed 
loss of heterozygosity in several cases. It seems likely that 
multiple and different events occur along each chromosomal 



arm and that the use of cDNA microarrays for analysis of DNA 
copy number changes will reach a resolution that can resolve 
these changes, as has recently been proposed (2). The outlier 
data were not more frequent at the boundaries of the CGH 
aberrations. At present we do not know the mechanism be- 
hind chromosomal aneuploidy and cannot predict whether 
chromosomal gains will be transcribed to a larger extent than 
the two native alleles. A mechanism such as genetic imprinting has 
an impact on the expression level in normal cells and is often 
reduced in tumors. However, the relation between imprinting 
and gain of chromosomal material is not known. 

We regard it as a strength of this investigation that we were 
able to compare invasive tumors to benign tumors rather than 
to normal urothelium, as the tumors studied were biologically 
very close and probably may represent successive steps in 
the progression of bladder cancer. Despite the limited amount 
of fresh tissue available it was possible to apply three different 
state of the art methods. The observed correlation between 
DNA copy number and mRNA expression is remarkable when 
one considers that different pieces of the tumor biopsies were 
used for the different sets of experiments. This indicates that 
bladder tumors are relatively homogenous, a notion recently 
supported by CGH and LOH data that showed a remarkable 
similarity even between tumors and distant metastasis (10, 23). 

In the few cases analyzed, mRNA and protein levels 
showed a striking correspondence although in some cases 
we found discrepancies that may be attributed to translational 
regulation, post-translational processing, protein degrada- 
tion, or a combination of these. Some transcripts belong to 
undertranslated mRNA pools, which are associated with few 
translationally inactive ribosomes; these pools, however, 
seem to be rare (24). Protein degradation, for example, may 
be very important in the case of polypeptides with a short 
half-life (e.g. signaling proteins). A poor correlation between 
mRNA and protein levels was found in liver cells as deter- 
mined by arrays and 2D-PAGE (25), and a moderate correlation 
was recently reported by Ideker et al. (26) in yeast. 

Interestingly, our study revealed a much better correlation 
between gained chromosomal areas and increased mRNA 
levels than between loss of chromosomal areas and reduced 
mRNA levels. In general, the level of CGH change determined 
the ability to detect a change in transcript. One possible 
explanation could be that by losing one allele the change in 
mRNA level is not so dramatic as compared with gain of 
material, which can be rather unlimited and may lead to a 
severalfold increase in gene copy number resulting in a much 
higher impact on transcript level. The latter would be much 
easier to detect on the expression arrays as the cut-off point 
was placed at a 2-fold level so as not to be biased by noise on 
the array. Construction of arrays with a better signal to noise 
ratio may in the future allow detection of less than 2-fold 
alterations in transcript levels, a feature that may facilitate the 
analysis of the effect of loss of chromosomal areas on tran- 
script levels. 

In eleven cases we found a significant correlation between 
DNA copy number, mRNA expression, and protein level. Four 
of these proteins were encoded by genes located at a fre- 
quently amplified area in chromosome 17q. Whether DNA 
copy number is one of the mechanisms behind alteration of 
these eleven proteins is at present unknown and will have to 
be proved by other methods using a larger number of sam- 
ples. One factor making such studies complicated is the large 
extent of protein modification that occurs after translation, 
requiring immunoidentification and/or mass spectrometry to 
correctly identify the proteins in the gels. 

In conclusion, the results presented in this study exemplify 
the large body of knowledge that may be possible to gather in 
the future by combining state of the art techniques that follow 
the pathway from DNA to protein (26). Here, we used a tradi- 
tional chromosomal CGH method, but in the future high reso- 
lution CGH based on microarrays with many thousand radiation 
hybrid-mapped genes will increase the resolution and informa- 
tion derived from these types of experiments (2). Combined with 
expression arrays analyzing transcripts derived from genes with 
known locations, and 2D gel analysis to obtain information at 
the post-translational level, a clearer and more developed un- 
derstanding of the tumor genome will be forthcoming. 

ABSTRACT 

Genetic changes underlie tumor progression and may lead to 
cancer-specific expression of critical genes. Over 1100 publications have 
described the use of comparative genomic hybridization (CGH) to analyze 
the pattern of copy number alterations in cancer, but very few of the genes 
affected are known. Here, we performed high-resolution CGH analysis on 
cDNA microarrays in breast cancer and directly compared copy number 
and mRNA expression levels of 13,824 genes to quantitate the impact of 
genomic changes on gene expression. We identified and mapped the 
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12 
Mb. Throughout the genome, both high- and low-level copy number 
changes had a substantial impact on gene expression, with 44% of the 
highly amplified genes showing overexpression and 10.5% of the highly 
overexpressed genes being amplified. Statistical analysis with random 
permutation tests identified 270 genes whose expression levels across 14 
samples were systematically attributable to gene amplification. These 
included most previously described amplified genes in breast cancer and 
many novel targets for genomic alterations, including the HOXB7 gene, 
the presence of which in a novel amplicon at 17q21.3 was validated in 
10.2% of primary breast cancers and associated with poor patient 
prognosis. In conclusion, CGH on cDNA microarrays revealed hundreds of 
novel genes whose overexpression is attributable to gene amplification. 
These genes may provide insights to the clonal evolution and progression 
of breast cancer and highlight promising therapeutic targets. 

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have 
facilitated classification of cancers into biologically distinct catego- 
ries, some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have re- 
mained elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited. 

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the [illegible] amplified oncogenes, such as [illegible] 
and EGFR (7, 8), in breast cancer and other solid tumors. Besides 
amplifications of known oncogenes, over 20 [illegible] DNA amplification 
have been mapped in [illegible] 10). However, these amplicons are often 
[illegible] poorly defined, and [illegible] unknown. 

We hypothesized that genome-wide identification of those gene 
expression changes that are attributable to underlying gene copy 
number alterations would highlight transcripts that are actively 
involved in the causation or maintenance of the malignant phenotype. 
To identify such transcripts, we applied a combination of cDNA and 
CGH microarrays to: (a) determine the global impact that gene copy 
number variation plays in breast cancer development and progression; 
and (b) identify and characterize those genes whose mRNA expression 
is most significantly associated with amplification of the 
corresponding genomic template. 

The costs of publication of this article were defrayed in part by the payment of page 
charges. This article must therefore be hereby marked advertisement in accordance with 
18 U.S.C. Section 1734 solely to indicate this fact. 

[illegible] Finnish Breast Cancer Group, the Foundation for the Development of Laboratory 
Medicine, the Medical Research Fund of the Tampere University Hospital, the Foundation for 
Commercial and Technical Sciences, and the Swedish Research Council. 

Supplementary data for this article are available at Cancer Research Online 
(http://cancerres.aacrjournals.org). 

Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of 
over- and underexpressed genes (Y axis) according to copy number ratios (X axis). 
Threshold values used for over- and underexpression were >2.184 (global upper 7% of 
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage 
of amplified and deleted genes according to expression ratios. Threshold values for 
amplification and deletion were >1.5 and <0.7. 

Fig. 2. Genome-wide copy number and expression analysis in the MCF-7 breast cancer cell line. A, chromosomal CGH analysis of MCF-7. The copy number ratio profile [illegible] 
across the entire genome from 1p telomere to Xq telomere is shown along with ±1 SD (orange lines). The black horizontal line indicates a ratio of 1.0; red line, a ratio of 0.8; 
and green line, a ratio of [illegible]. B-C, genome-wide copy number analysis in MCF-7 by CGH on cDNA microarray. The copy number ratios were plotted as a function of the position 
of the cDNA clones along the human genome. In B, individual data points are connected with a line, and a moving median of 10 adjacent clones is shown. Red horizontal line, the 
copy number ratio of 1.0. In C, individual data points are labeled by color coding according to cDNA expression ratios. The bright red dots indicate the upper 2%, and dark red dots, 
the next 5% of the expression ratios in MCF-7 cells (overexpressed genes); bright green dots indicate the lowest 2%, and dark green dots, the next 5% of the expression ratios 
(underexpressed genes); the rest of the observations are shown with [illegible] 
indicated with a dashed line. 

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20, 
BT-474, HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468, 
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the 
American Type Culture Collection (Manassas, VA). Cells were grown under 
recommended culture conditions. Genomic DNA and mRNA were isolated 
using standard protocols. 

Copy Number and Expression Analyses by cDNA Microarrays. The 
preparation and printing of the 13,824 cDNA clones on glass slides were 
performed as described (11-13). Of these clones, 244 represented 
uncharacterized expressed sequence tags, and the remainder corresponded to known 
genes. CGH experiments on cDNA microarrays were done as described (14, 
15). Briefly, 20 μg of genomic DNA from breast cancer cell lines and normal 
human WBCs were digested for 14-18 h with AluI and RsaI (Life 
Technologies, Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six 
μg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham 
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using 
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and 
posthybridization washes (13) were done as described. For the expression 
analyses, a standard reference (Universal Human Reference RNA; Stratagene, 
La Jolla, CA) was used in all experiments. Forty μg of reference RNA were 
labeled with Cy3-dUTP and 3.5 μg of test mRNA with Cy5-dUTP, and the 
labeled cDNAs were hybridized on microarrays as described (13, 15). For both 
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo 
Alto, CA) was used to measure the fluorescence intensities at the target 
locations using the DEARRAY software (16). After background subtraction, 
average intensities at each clone in the test hybridization were divided by the 
average intensity of the corresponding clone in the control hybridization. For 
the copy number analysis, the ratios were normalized on the basis of the 
distribution of ratios of all targets on the array and for the expression analysis 
on the basis of 88 housekeeping genes, which were spotted four times onto the 
array. Low quality measurements (i.e., copy number data with mean reference 
intensity <100 fluorescent units, and expression data with both test and 
reference intensity <100 fluorescent units and/or with spot size <50 units) 



were excluded from the analysis and were treated as missing values. The 
distributions of fluorescence ratios were used to define cutpoints for increased/ 
decreased copy number. Genes with CGH ratio >1.43 (representing the upper 
5% of the CGH ratios across all experiments) were considered to be amplified, 
and genes with ratio <0.73 (representing the lower 5%) were considered to be 
deleted. 
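
The percentile-based cutpoints described above can be illustrated with a minimal sketch. It assumes the ratios are pooled into one list, that excluded low-quality measurements are represented as `None`, and that percentiles are linearly interpolated; the function names and the interpolation rule are illustrative assumptions rather than the authors' procedure.

```python
import math


def percentile(values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def call_copy_number(ratios, upper_q=95.0, lower_q=5.0):
    """Derive cutpoints from the pooled ratio distribution and label each
    measurement. None marks a measurement treated as a missing value."""
    valid = [r for r in ratios if r is not None]
    hi_cut = percentile(valid, upper_q)
    lo_cut = percentile(valid, lower_q)
    calls = []
    for r in ratios:
        if r is None:
            calls.append("missing")
        elif r > hi_cut:
            calls.append("amplified")
        elif r < lo_cut:
            calls.append("deleted")
        else:
            calls.append("unchanged")
    return hi_cut, lo_cut, calls


# Toy data: mostly balanced ratios, one gain, one loss, one missing value.
toy_ratios = [1.0] * 18 + [0.5, 2.0, None]
hi_cut, lo_cut, calls = call_copy_number(toy_ratios)
print(hi_cut, lo_cut, calls.count("amplified"), calls.count("deleted"))
```

On real data the same procedure would be run over the ratios from all experiments, which is how cutpoints such as 1.43 and 0.73 arise from the upper and lower 5% of the distribution.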

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate 
the influence of copy number alterations on gene expression, we applied the 
following statistical approach. CGH and cDNA calibrated intensity ratios were 
log-transformed and normalized using median centering of the values in each 
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were 
median centered. For each gene, the CGH data were represented by a vector 
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification. 
Amplification was correlated with gene expression using the signal-to-noise 
statistics (1). We calculated a weight, $w_g$, for each gene as follows: 

$$ w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}} $$ 

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression 
levels for amplified and nonamplified cell lines, respectively. To assess the 
statistical significance of each weight, we performed 10,000 random 
permutations of the label vector. The probability that a gene had a larger or equal 
weight by random permutation than the original weight was denoted by α. A 
low α (<0.05) indicates a strong association between gene expression and 
amplification. 
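
The weight and its permutation test can be sketched in a few lines. This is a minimal illustration, assuming the sample standard deviation is used, that each group holds at least two cell lines, and that labels are shuffled with a seeded generator; the function names and toy values are hypothetical and are not taken from the study.

```python
import math
import random
import statistics


def signal_to_noise(expr, labels):
    """w_g = (m_g1 - m_g0) / (sd_g1 + sd_g0) for one gene.
    expr: expression values per cell line; labels: 1 = amplified, 0 = not."""
    amp = [x for x, lab in zip(expr, labels) if lab == 1]
    non = [x for x, lab in zip(expr, labels) if lab == 0]
    if len(amp) < 2 or len(non) < 2:
        raise ValueError("need at least two cell lines in each group")
    diff = statistics.mean(amp) - statistics.mean(non)
    denom = statistics.stdev(amp) + statistics.stdev(non)
    if denom == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / denom


def permutation_alpha(expr, labels, n_perm=10000, seed=0):
    """Fraction of label permutations whose weight is >= the observed weight."""
    rng = random.Random(seed)
    observed = signal_to_noise(expr, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if signal_to_noise(expr, shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm


# Toy gene: 4 amplified cell lines with high expression, 10 without.
toy_expr = [2.1, 1.8, 2.5, 2.2, 0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.15, 0.05, -0.15]
toy_labels = [1, 1, 1, 1] + [0] * 10
w, alpha = permutation_alpha(toy_expr, toy_labels, n_perm=2000)
print(w, alpha)
```

Shuffling the label vector keeps the number of amplified cell lines fixed, so each permutation asks how often an equally sized random group would separate as well as the truly amplified one.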

Genomic Localization of cDNA Clones and Amplicon Mapping. Each 
cDNA clone on the microarray was assigned to a Unigene cluster using the 
Unigene Build 141.⁶ A database of genomic sequence alignment information 
for mRNA sequences was created from the August 2001 freeze of the 
University of California Santa Cruz's GoldenPath database.⁷ The chromosome and 
bp positions for each cDNA clone were then retrieved by relating these data 
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two 
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three 
adjacent clones in a single cell line. The amplicon start and end positions were 



⁶ Internet address: http://research.nhgri.nih.gov/microarray/downloadable_cdna.html. 
⁷ Internet address: www.genome.ucsc.edu. 

Table 1 Summary of independent amplicons in 14 breast cancer cell lines by 
CGH microarray 

Location 
1q21 
1q22 
3p14 
7p12.1-7p1U 
7q31 
7q32 
8q21.U-8q21.13 
8q2U 
8q23.Mq24.14 
8q2432 
9p13 
13q22-q31 
16q22 
17q11 
17q12-q21.2 
17q2U2-q21J3 
17q22-q23.3 
17q23.3-q24.3 
19q13 
20q1132 
20q13.12 
20q13.t2-43.1J 
20q13.2-q13.32 


extended to include neighboring nonamplified clones (ratio, <1.5). The 
amplicon size determination was partially dependent on local clone density. 
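
The amplicon definition above lends itself to a short sketch. The version below reads the first rule as requiring the same pair of adjacent clones to exceed 2.0 in at least two cell lines, which is one interpretation of the wording; the function name, the data layout (one position-ordered list of ratios per cell line for a single chromosome), and the treatment of `None` as not amplified are illustrative assumptions. The subsequent extension of boundaries to neighboring nonamplified clones is not modeled here.

```python
def call_amplicons(ratios_by_line, threshold=2.0):
    """Return (start_index, end_index) clone intervals that satisfy either rule:
    1) the same two adjacent clones exceed the threshold in >= 2 cell lines, or
    2) three adjacent clones exceed the threshold in a single cell line."""
    lines = list(ratios_by_line.values())
    if not lines:
        return []
    n = len(lines[0])
    if any(len(line) != n for line in lines):
        raise ValueError("all cell lines must cover the same clones")

    # Boolean mask of clones above the threshold, per cell line.
    high = [[r is not None and r > threshold for r in line] for line in lines]
    marked = [False] * n

    # Rule 1: adjacent pair supported by two or more cell lines.
    for i in range(n - 1):
        support = sum(1 for h in high if h[i] and h[i + 1])
        if support >= 2:
            marked[i] = marked[i + 1] = True

    # Rule 2: three adjacent clones in any single cell line.
    for h in high:
        for i in range(n - 2):
            if h[i] and h[i + 1] and h[i + 2]:
                marked[i] = marked[i + 1] = marked[i + 2] = True

    # Merge consecutive marked clones into intervals.
    amplicons, start = [], None
    for i, flag in enumerate(marked):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            amplicons.append((start, i - 1))
            start = None
    if start is not None:
        amplicons.append((start, n - 1))
    return amplicons


toy = {
    "line_A": [1.0, 2.5, 2.6, 1.1, 1.0, 3.0, 3.1, 3.2, 1.0],
    "line_B": [1.1, 2.2, 2.4, 1.0, 0.9, 1.0, 1.2, 1.0, 1.0],
    "line_C": [1.0, 1.0, 1.0, 1.0, 2.3, 1.0, 1.0, 1.0, 1.0],
}
print(call_amplicons(toy))
```

In the toy data the first interval is supported by two cell lines, the second by a run of three clones in one cell line, and the isolated high clone in the third cell line is ignored.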

FISH. Dual-color interphase FISH to breast cancer cell lines was done as 
described (17). Bacterial artificial chromosome clone RP11-361K8 was 
labeled with SpectrumOrange (Vysis, Downers Grove, IL), and 
SpectrumOrange-labeled probe for EGFR was obtained from Vysis. 
SpectrumGreen-labeled chromosome 7 and 17 centromere probes (Vysis) were used as a 
reference. A tissue microarray containing 612 formalin-fixed, 
paraffin-embedded primary breast cancers (17) was applied in FISH analyses as described 
(18). The use of these specimens was approved by the Ethics Committee of the 
University of Basel and by the NIH. Specimens containing a 2-fold or higher 
increase in the number of test probe signals, as compared with corresponding 
centromere signals, in at least 10% of the tumor cells were considered to be 
amplified. Survival analysis was performed using the Kaplan-Meier method 
and the log-rank test. 

RT-PCR. The HOXB7 expression level was determined relative to 
GAPDH. Reverse transcription and PCR amplification were performed using 
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA 
as a template. HOXB7 primers were 5'-GAGCAGAGGGACTCGGACTT-3' 
and 5'-GCOTCAGOTAGCGATTGTAG-3'. 

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824 
arrayed cDNA clones were applied for analysis of gene expression 
and gene copy number (CGH microarrays) in 14 breast cancer cell 
lines. The results illustrate a considerable influence of copy number 
on gene expression patterns. Up to 44% of the highly amplified 
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to 
the global upper 7% of expression ratios), compared with only 6% for 
genes with normal copy number levels (Fig. 1A). Conversely, 10.5% 
of the transcripts with high-level expression (cDNA ratio, >10) 
showed increased copy number (Fig. 1B). Low-level copy number 
increases and decreases were also associated with similar, although 
less dramatic, outcomes on gene expression (Fig. 1). 
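
The percentages quoted above come from tabulating overexpression within copy number bins. A minimal sketch of that tabulation follows; the function name, the half-open bin convention, and the toy values are assumptions made for illustration, with 2.184 taken from the Fig. 1 legend as the overexpression threshold.

```python
def fraction_overexpressed(pairs, cn_low, cn_high, expr_cutoff=2.184):
    """Fraction of genes in a copy number bin [cn_low, cn_high) whose
    expression ratio exceeds expr_cutoff.
    pairs: iterable of (copy_number_ratio, expression_ratio) per gene."""
    in_bin = [expr for cn, expr in pairs if cn_low <= cn < cn_high]
    if not in_bin:
        return None  # no genes fall in this bin
    return sum(1 for expr in in_bin if expr > expr_cutoff) / len(in_bin)


# Toy (copy number ratio, expression ratio) pairs, not study data.
toy_pairs = [
    (3.0, 5.0), (2.6, 1.0), (2.8, 2.5),              # highly amplified genes
    (1.0, 1.0), (1.1, 3.0), (0.9, 0.8), (1.0, 1.2),  # near-normal copy number
]
print(fraction_overexpressed(toy_pairs, 2.5, float("inf")))
print(fraction_overexpressed(toy_pairs, 0.8, 1.2))
```

Swapping the roles of the two ratios gives the converse tabulation, the share of highly expressed genes that are amplified.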

Identification of Distinct Breast Cancer Amplicons. Base-pair 
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy 
number changes as a function of genomic position (Fig. 2, 
Supplement Fig. A). The average spacing of clones throughout the genome 
was 267 kb. This high-resolution mapping identified 24 independent 
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table 
1). Several amplification sites detected previously by chromosomal 
CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1, 
and 20q13.2 regions being most commonly amplified. Furthermore, 
the boundaries of these amplicons were precisely delineated. In 
addition, novel amplicons were identified at 9p13 (38.65-39.25 Mb), 
and 17q21.3 (52.47-55.80 Mb). 

Direct Identification of Putative Amplification Target Genes.
The cDNA/CGH microarray technique enables the direct correla-
tion of copy number and expression data on a gene-by-gene basis
throughout the genome. We directly annotated high-resolution
CGH plots with gene expression data using color coding. Fig. 2C
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly overex-
pressed. A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly overex-
pressed genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel
amplification was validated to be present in 10.2% of 363 primary
breast cancers by FISH to a tissue microarray and was associated
with poor prognosis of the patients (P = 0.001).

Fig. 3. Annotation of gene expression data on CGH microarray profiles. A, genes in the
7p11-p12 amplicon in the MDA-468 cell line are highly expressed (red dots) and include
the EGFR oncogene. B, several genes in the 17q12, 17q21.3, and 17q23 amplicons in the
BT-474 breast cancer cell line are highly overexpressed (red) and include the HOXB7
gene. The data labels and color coding are as indicated for Fig. 2C. Insets show
chromosomal CGH profiles for the corresponding chromosomes and validation of the
increased copy number by interphase FISH using EGFR (red) and chromosome 7
centromere probe (green) to MDA-468 (A) and HOXB7-specific probe (red) and chro-
mosome 17 centromere (green) to BT-474 cells (B).

Statistical Identification and Characterization of 270 Highly 
Expressed Genes in Amplicons. Statistical comparison of expres- 
sion levels of all genes as a function of gene amplification identified 
270 genes whose expression was significantly influenced by copy 
number across all 14 cell lines (Fig. 4, Supplemental Fig. B). Accord-
ing to the gene ontology data, 91 of the 270 genes represented
hypothetical proteins or genes with no functional annotation, whereas
179 had associated functional information available. Of these, 151
(84%) are implicated in apoptosis, cell proliferation, signal transduc- 
tion, and transcription, whereas 28 (16%) had functional annotations 
that could not be directly linked with cancer. 



DISCUSSION 

The importance of recurrent gene and chromosome copy number 
changes in the development and progression of solid tumors has been 
characterized in >1000 publications applying CGH (9, 10), as well
as in a large number of other molecular cytogenetic, cytogenetic, and
molecular genetic studies. The effects of these somatic genetic 
changes on gene expression levels have remained largely unknown, 
although a few studies have explored gene expression changes occur- 
ring in specific amplicons (15, 19-21). Here, we applied genome- 
wide cDNA microarrays to identify transcripts whose expression 
changes were attributable to underlying gene copy number alterations 
in breast cancer. 

The overall impact of copy number on gene expression patterns was 
substantial, with the most dramatic effects seen in the case of high-
level copy number increase. Low-level copy number gains and losses
also had a significant influence on expression levels of genes in the
regions affected, but these effects were more subtle on a gene-by-gene
basis than those of high-level amplifications. However, the impact of
low-level gains on the dysregulation of gene expression patterns in
cancer may be equally important if not more important than that of
high-level amplifications. Aneuploidy and low-level gains and losses
of chromosomal arms represent the most common types of genetic
alterations in breast and other cancers and, therefore, have an influ-
ence on many genes. Our results in breast cancer extend the recent
studies on the impact of aneuploidy on global gene expression pat-
terns in yeast cells, acute myeloid leukemia, and a prostate cancer
model system (22-24).

Internet address: http://www.geneontology.org/.

The CGH microarray analysis identified 24 independent breast
cancer amplicons. We defined the precise boundaries for many am-
plicons detected previously by chromosomal CGH (9, 10, 25, 26) and
also discovered novel amplicons that had not been detected previ-
ously, presumably because of their small size (only 1-2 Mb) or close
proximity to other larger amplicons. One of these novel amplicons
involved the homeobox gene region at 17q21.3 and led to the over-
expression of the HOXB7 and HOXB2 genes. The homeodomain
transcription factors are known to be key regulators of embryonic
development and have been occasionally reported to undergo aberrant
expression in cancer (27, 28). HOXB7 transfection induced cell pro-
liferation in melanoma, breast, and ovarian cancer cells and increased
tumorigenicity and angiogenesis in breast cancer (29-32). The pres-
ent results imply that gene amplification may be a prominent mech-
anism for overexpressing HOXB7 in breast cancer and suggest that
HOXB7 contributes to tumor progression and confers an aggressive
disease phenotype in breast cancer. This view is supported by our
finding of amplification of HOXB7 in 10% of 363 primary breast
cancers, as well as an association of amplification with poor prognosis
of the patients.

We carried out a systematic search to identify genes whose 
expression levels across all 14 cell lines were attributable to 
amplification status. Statistical analysis revealed 270 such genes 
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein S6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22),
and bone morphogenic protein (20q13.1), whose activation by
amplification may similarly promote breast cancer progression.
Most of the 270 genes have not been implicated previously in
breast cancer development and suggest novel pathogenetic mech-
anisms. Although we would not expect all of them to be causally
involved, it is intriguing that 84% of the genes with associated
functional information were implicated in apoptosis, cell prolifer-
ation, signal transduction, transcription, or other cellular processes
that could directly imply a possible role in cancer progression.
Therefore, a detailed characterization of these genes may provide
biological insights to breast cancer progression and might lead to
the development of novel therapeutic strategies.

In summary, we demonstrate application of cDNA microarrays
to the analysis of both copy number and expression levels of over
12,000 transcripts throughout the breast cancer genome, roughly
once every 267 kb. This analysis provided: (a) evidence of a
prominent global influence of copy number changes on gene
expression levels; (b) a high-resolution map of 24 independent
amplicons in breast cancer; and (c) identification of a set of 270
genes the overexpression of which was statistically attributable to
gene amplification. Characterization of a novel amplicon at
17q21.3 implicated amplification and overexpression of the
HOXB7 gene in breast cancer, including a clinical association
between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by
gene amplification provides a powerful approach to highlight
genes with an important role in cancer as well as to prioritize and
validate putative targets for therapy development.

Genomic DNA copy number alterations are key genetic events in
the development and progression of human cancers. Here we 
report a genome-wide microarray comparative genomic hybrid-
ization (array CGH) analysis of DNA copy number variation in
a series of primary human breast tumors. We have profiled DNA
copy number alteration across 6,691 mapped human genes, in 44
predominantly advanced, primary breast tumors and 10 breast
cancer cell lines. While the overall patterns of DNA amplification
and deletion corroborate previous cytogenetic studies, the high-
resolution (gene-by-gene) mapping of amplicon boundaries and
the quantitative analysis of amplicon shape provide significant
improvement in the localization of candidate oncogenes. Parallel
microarray measurements of mRNA levels reveal the remarkable
degree to which variation in gene copy number contributes to
variation in gene expression in tumor cells. Specifically, we find
that 62% of highly amplified genes show moderately or highly
elevated expression, that DNA copy number influences gene ex-
pression across a wide range of DNA copy number alterations
(deletion, low-, mid- and high-level amplification), that on average,
a 2-fold change in DNA copy number is associated with a corre-
sponding 1.5-fold change in mRNA levels, and that overall, at least
12% of all the variation in gene expression among the breast
tumors is directly attributable to underlying variation in gene copy
number. These findings provide evidence that widespread DNA
copy number alteration can lead directly to global deregulation of
gene expression, which may contribute to the development or
progression of cancer.

Conventional cytogenetic techniques, including comparative 
genomic hybridization (CGH) (1), have led to the identifi- 
cation of a number of recurrent regions of DNA copy number 
alteration in breast cancer cell lines and tumors (2-4). While 
some of these regions contain known or candidate oncogenes 
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide 
map, delineating the boundaries of DNA copy number alter- 
ations in tumors, should facilitate the localization and identifi- 
cation of oncogenes and tumor suppressor genes in breast 
cancer. In this study, we have created such a map, using 
array-based CGH (5-7) to profile DNA copy number alteration 
in a series of breast cancer cell lines and primary tumors. 

An unresolved question is the extent to which the widespread 
DNA copy number changes that we and others have identified 
in breast tumors alter expression of genes within involved 
regions. Because we had measured mRNA levels in parallel in 
the same samples (8), using the same DNA microarrays, we had 
an opportunity to explore on a genomic scale the relationship 
between DNA copy number changes and gene expression. From
this analysis, we have identified a significant impact of wide-
spread DNA copy number alteration on the transcriptional
programs of breast tumors. 

Materials and Methods 

Tumors and Cell lines. Primary breast tumors were predominantly 
large (>3 cm), intermediate-grade, infiltrating ductal carcino- 
mas, with more than 50% being lymph node positive. The 
fraction of tumor cells within specimens averaged at least 50%. 
Details of individual tumors have been published (8, 9), and
are summarized in Table 1, which is published as supporting 
information on the PNAS web site, www.pnas.org. Breast cancer 
cell lines were obtained from the American Type Culture 
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA label- 
ing and hybridizations were performed essentially as described 
in Pollack et al. (7), with slight modifications. Two micrograms
of DNA was labeled in a total volume of 50 microliters and the
volumes of all reagents were adjusted accordingly. "Test" DNA
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691
different mapped human genes (i.e., UniGene clusters). The 
"reference" (labeled with Cy3) for each hybridization was nor- 
mal female leukocyte DNA from a single donor. The fabrication 
of cDNA microarrays and the labeling and hybridization of 
mRNA samples have been described (8). 

Data Analysis and Map Positions. Hybridized arrays were scanned 
on a GenePix scanner (Axon Instruments, Foster City, CA), and 
fluorescence ratios (test/reference) calculated using ScanAlyze
software (available at http://rana.lbl.gov). Fluorescence ratios 
were normalized for each array by setting the average log 
fluorescence ratio for all array elements equal to 0. Measure- 
ments with fluorescence intensities more than 20% above back- 
ground were considered reliable. DNA copy number profiles 
that deviated significantly from background ratios measured in 
normal genomic DNA control hybridizations were interpreted as 
evidence of real DNA copy number alteration (see Estimating 
Significance of Altered Fluorescence Ratios in the supporting 
information). When indicated, DNA copy number profiles are 
displayed as a moving average (symmetric 5-nearest neighbors). 
Map positions for arrayed human cDNAs were assigned by
identifying the starting position of the best and longest match of
any DNA sequence represented in the corresponding UniGene
cluster (10) against the "Golden Path" genome assembly
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene
clusters represented by multiple arrayed elements, mean fluo-
rescence ratios (for all elements representing the same UniGene
cluster) are reported. For mRNA measurements, fluorescence
ratios are "mean-centered" (i.e., reported relative to the mean
ratio across the 44 tumor samples). The data set described here
can be accessed in its entirety in the supporting information.

Fig. 1. Genome-wide measurement of DNA copy number alteration by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different
numbers of X chromosomes, for breast cancer cell lines, and for breast tumors. Each row represents a different cell line or tumor, and each column represents
one of 6,691 different mapped human genes present on the microarray, ordered by genome map position from 1pter through Xqter. Moving average (symmetric
5-nearest neighbors) fluorescence ratios (test/reference) are depicted using a log2-based pseudocolor scale (indicated), such that red luminescence reflects
fold-amplification, green luminescence reflects fold-deletion, and black indicates no change (gray indicates poorly measured data). (b) Enlarged view of DNA
copy number profiles across the X chromosome, shown for cell lines containing different numbers of X chromosomes.
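
The per-array normalization and the moving-average display described in Data Analysis and Map Positions can be sketched as below. This is an illustrative sketch only, not the ScanAlyze implementation or the authors' code. The function names are hypothetical, the use of base-2 logarithms is an assumption (the text says only "log"), and so are the reading of "symmetric 5-nearest neighbors" as a centered five-point window and the truncation of that window at the ends of the gene order.

```python
import math


def normalize_log_ratios(ratios):
    """Convert test/reference ratios to log2 and shift them so that the
    average log ratio over all array elements equals 0."""
    logs = [math.log2(r) for r in ratios]
    mean = sum(logs) / len(logs)
    return [x - mean for x in logs]


def moving_average(values, window=5):
    """Centered moving average over genes ordered by map position.

    Assumption: the window shrinks at both ends of the ordered list
    rather than padding or wrapping around.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

In this sketch the smoothing is applied to normalized log ratios, so an isolated outlier is damped while a run of neighboring genes with a shared shift is preserved.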

Results 

We performed CGH on 44 predominantly locally advanced, 
primary breast tumors and 10 breast cancer cell lines, using 
cDNA microarrays containing 6,691 different mapped human 
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the im- 
proved spatial resolution of array CGH, we ordered (fluores- 
cence ratios for) the 6,691 cDNAs according to the "Golden 
Path" (http://genome.ucsc.edu/) genome assembly of the draft 
human genome sequences (11). In so doing, arrayed cDNAs not 
only themselves represent genes of potential interest (e.g., 
candidate oncogenes within amplicons), but also provide precise 
genetic landmarks for chromosomal regions of amplification and
deletion. Parallel analysis of DNA from cell lines containing
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect single-
copy loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence
ratios were linearly proportional to copy number ratios, which
were slightly underestimated, in agreement with previous ob-
servations (7). Numerous DNA copy number alterations were
evident in both the breast cancer cell lines and primary tumors
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes
were generally lower in the tumor samples. DNA copy-number
alterations were found in every cancer cell line and tumor, and
on every human chromosome in at least one sample. Recurrent
regions of DNA copy number gain and loss were readily iden-
tifiable. For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respective-
ly), as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 70%/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing
of gains/losses is provided in Tables 2 and 3, which are published
as supporting information on the PNAS web site). The total
number of genomic alterations (gains and losses) was found to
be significantly higher in breast tumors that were high grade (P =
0.008), consistent with published CGH data (3), estrogen recep-
tor negative (P = 0.04), and harboring TP53 mutations (P =
0.0006) (see Table 4, which is published as supporting informa-
tion on the PNAS web site).

of X chromosomes, for breast cancer cell lines, and for breast tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering to
highlight recurrent copy number changes. The 241 genes present on the microarrays and mapping to chromosome 8 are ordered by position along the
chromosome. Fluorescence ratios (test/reference) are depicted by a log2 pseudocolor scale (indicated). Selected genes are indicated with color-coded text (red,
increased; green, decreased; black, no change; gray, not well measured) to reflect correspondingly altered mRNA levels (observed in the majority of the subset
of samples displaying the DNA copy number change). The map positions for genes of interest that are not represented on the microarray are indicated in the
row above those genes represented on the array. (b) Graphical display of DNA copy number profile for breast cancer cell line SKBR3. Fluorescence ratios
(tumor/normal) are plotted on a log2 scale for chromosome 8 genes, ordered along the chromosome.
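
The fold changes quoted for the X chromosome control cell lines follow from simple arithmetic against the reference, which Materials and Methods describes as normal female DNA (two X copies). The sketch below only illustrates that arithmetic; the function name and the dictionary of karyotypes are hypothetical helpers, and the values are expected copy number ratios, not measured fluorescence ratios (which the text notes were slightly underestimated).

```python
def expected_x_ratio(test_x_copies, reference_x_copies=2):
    """Expected test/reference copy number ratio for X-linked genes.

    reference_x_copies defaults to 2 because the reference DNA is
    described as coming from a normal female donor.
    """
    if reference_x_copies <= 0:
        raise ValueError("reference must carry at least one X copy")
    return test_x_copies / reference_x_copies


# Karyotypes named in the text, mapped to their number of X chromosomes.
X_COPIES = {"45,XO": 1, "47,XXX": 3, "48,XXXX": 4, "49,XXXXX": 5}

for karyotype, copies in X_COPIES.items():
    print(karyotype, expected_x_ratio(copies))
```

Under these assumptions the single-copy loss corresponds to a ratio of 0.5, and the three gains to ratios of 1.5, 2, and 2.5.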

The improved spatial resolution of our array CGH analysis is 
illustrated for chromosome 8, which displayed extensive DNA 
copy number alteration in our series. A detailed view of the
variation in the copy number of 241 genes mapping to chromo- 
some 8 revealed multiple regions of recurrent amplification; 
each of these potentially harbors a different known or previously 
uncharacterized oncogene (Fig. 2a). The complexity of amplicon 
structure is most easily appreciated in the breast cancer cell line 
SKBR3. Although a conventional CGH analysis of 8q in SKBR3 
identified only two distinct regions of amplification (12), we 
observed three distinct regions of high-level amplification (la-
beled 1-3 in Fig. 2b). For each of these regions we can define the
boundaries of the interval recurrently amplified in the tumors we
examined; in each case, known or plausible candidate oncogenes 
can be identified (a description of these regions, as well as the 
recurrently amplified regions on chromosomes 17 and 20, can be 
found in Figs. 6 and 7, which are published as supporting 
information on the PNAS web site). 

For a subset of breast cancer cell lines and tumors (4 and 37, 
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is 
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly ex- 
pressed are the strongest candidate oncogenes within an ampli- 
con. Perhaps more significantly, our parallel analysis of DNA 
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells. 

A strong influence of DNA copy number on gene expression 
is evident in an examination of the pseudocolor representations
of DNA copy number and mRNA levels for genes on chromo-
some 17 (Fig. 3). The overall patterns of gene amplification and
elevated gene expression are quite concordant; i.e., a significant
fraction of highly amplified genes appear to be correspondingly
highly expressed. The concordance between high-level amplifi-
cation and increased gene expression is not restricted to chro-
mosome 17. Genome-wide, of 117 high-level DNA amplifica-
tions (fluorescence ratios >4, and representing 91 different
genes), 62% (representing 54 different genes; see Table 5, which
is published as supporting information on the PNAS web site)
are found associated with at least moderately elevated mRNA
levels (mean-centered fluorescence ratios >2), and 42% (rep-
resenting 36 different genes) are found associated with compa-
rably highly elevated mRNA levels (mean-centered fluorescence
ratios >4).

Fig. 3. Concordance between DNA copy number and gene expression across chromosome 17. DNA copy number alteration (Upper) and mRNA levels (Lower)
are illustrated for breast cancer cell lines and tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering (Upper) and the
identical sample order is maintained (Lower). The 354 genes present on the microarrays and mapping to chromosome 17, and for which both DNA copy number
and mRNA levels were determined, are ordered by position along the chromosome; selected genes are indicated in color-coded text (see Fig. 2 legend).
Fluorescence ratios (test/reference) are depicted by separate log2 pseudocolor scales (indicated).

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we 
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-, 
medium-, and high-level amplification (Fig. 4a). For both the
breast cancer cell lines and tumors, average mRNA levels
tracked with DNA copy number across all five classes, in a
statistically significant fashion (P values for pair-wise Student's
t tests comparing adjacent classes: cell lines, 4 × 10^-49, 1 × 10^-49,
5 × 10^-5, 1 × 10^-[illegible]; tumors, 1 × 10^-[illegible], 1 × 10^-214, 5 × 10^-[illegible],
1 × 10^-4). A linear regression of the average log(DNA copy
number), for each class, against average log(mRNA level)
demonstrated that on average, a 2-fold change in DNA copy
number was accompanied by 1.4- and 1.3-fold changes in mRNA
level for the breast cancer cell lines and tumors, respectively (Fig.
4a, regression line not shown). Second, we characterized the
distribution of the 6,095 correlations between DNA copy num-
ber and mRNA level, each across the 37 tumor samples (Fig. 4b).
The distribution of correlations forms a normal-shaped curve,
but with the peak markedly shifted in the positive direction from
zero. This shift is statistically significant, as evidenced in a plot
of observed vs. expected correlations (Fig. 4c), and reflects a
pervasive global influence of DNA copy number alterations on
gene expression. Notably, the highest correlations between DNA
copy number and mRNA level (the right tail of the distribution
in Fig. 4b) comprise both amplified and deleted genes (data not
shown). Third, we used a linear regression model to estimate the
fraction of all variation measured in mRNA levels among the 37
tumors that could be attributed to underlying variation in DNA
copy number. From this analysis, we estimate that, overall, about
7% of all of the observed variation in mRNA levels can be
explained directly by variation in copy number of the altered
genes (Fig. 4d). We can reduce the effects of experimental
measurement error on this estimate by using only that fraction
of the data most reliably measured (fluorescence intensity/
background >3); using these data, our estimate of the percent
variation in mRNA levels directly attributed to variation in gene
copy number increases to 12% (Fig. 4d). This still undoubtedly
represents a significant underestimate, as the observed variation
in global gene expression is affected not only by true variation in
the expression programs of the tumor cells themselves, but also
by the variable presence of non-tumor cell types within clinical
samples.
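
The link between a regression slope on log-transformed data and a statement such as "a 2-fold change in DNA copy number is accompanied by an x-fold change in mRNA level" can be sketched as follows. This is a minimal ordinary least squares illustration, not the authors' regression model: the function names are hypothetical, base-2 logarithms are assumed, and the R^2 returned here is only the simple single-predictor analogue of the "fraction of variation explained" discussed in the text.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    x: log2 DNA copy number ratios; y: log2 mRNA ratios (same length).
    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length inputs with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Fraction of the variance in y explained by x (undefined if y is constant).
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else float("nan")
    return slope, intercept, r_squared


def fold_change_per_doubling(slope):
    """mRNA fold change implied by a 2-fold DNA copy number change.

    A doubling of copy number adds 1 to log2(copy number), which adds
    `slope` to log2(mRNA), i.e. multiplies the mRNA level by 2**slope.
    """
    return 2.0 ** slope
```

For example, a fitted slope of 0.5 on log2 data would correspond to roughly a 1.4-fold change in mRNA level per doubling of DNA copy number.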

Discussion 

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution
(gene-by-gene), and quantitatively measuring amplicon shape, to
assist in locating and identifying candidate oncogenes. By ana- 
lyzing mRNA levels in parallel, we have also discovered that 
changes in DNA copy number have a large, pervasive, direct 
effect on global gene expression patterns in both breast cancer
cell lines and tumors. Although the DNA microarrays used in our
analysis may display a bias toward characterized and/or highly 
expressed genes, because we are examining such a large fraction 
of the genome (approximately 20% of all human genes), and 
because, as detailed above, we are likely underestimating the 
contribution of DNA copy number changes to altered gene 
expression, we believe our findings are likely to be generalizable 
(but would nevertheless still be remarkable if only applicable to 
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in 
chromosome-wide gene expression biases (13). Two recent 
studies have begun to examine the global relationship between 
DNA copy number and gene expression in cancer cells. In 
agreement with our findings, Phillips et al. (14) have shown that
with the acquisition of tumorigenicity in an immortalized pros- 
tate epithelial cell line, new chromosomal gains and losses 
resulted in a statistically significant respective increase and 
decrease in the average expression level of involved genes. In 
contrast, Platzer et al. (15) recently reported that in metastatic 
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with
normal colonic epithelium. This report differs substantially from
our finding that 62% of highly amplified genes in breast cancer
exhibit at least 2-fold increased expression. These contrasting
findings may reflect methodological differences between the

the genomic distribution of expressed genes, even within existing
microarray gene expression data sets, may permit the inference 
of DNA copy number aberration, particularly aneuploidy (where
gene expression can be averaged across large chromosomal 
regions; see Fig. 3 and supporting information). Fifth, this 
finding implies that a substantial portion of the phenotypic 
uniqueness (and by extension, the heterogeneity in clinical 
behavior) among patients' tumors may be traceable to underly-
ing variation in DNA copy number. Sixth, this finding supports
a possible role for widespread DNA copy number alteration in
tumorigenesis (17, 18), beyond the amplification of specific
oncogenes and deletion of specific tumor suppressor genes.
Widespread DNA copy number alteration, and the concomitant
widespread imbalance in gene expression, might disrupt critical
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor
development or progression. Finally, our findings suggest the
possibility of cancer therapies that exploit specific or global
imbalances in gene expression in cancer.

Each year, over 182,000 women in the United States are
diagnosed with breast cancer, and approximately 45,000 die 
of the disease. 1 Incidence appears to be increasing in the 
United States at a rate of roughly 2% per year. The reasons 
for the increase are unclear, but non-genetic risk factors appear 
to play a large role. 2 

Five-year survival rates range from approximately 65%- 
85%, depending on demographic group, with a significant 
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predic- 
tive for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adju- 
vant therapy, which increases their survival. However, 20%- 
30% of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to iden- 
tify this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and 
treatment. 

Prognostic markers currently used in breast cancer recur- 
rence prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and 
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that 
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expres- 
sion of this protein are associated with rapid tumor growth, 
certain forms of therapy resistance, and shorter disease-free 
survival. The gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in 
40%-60% of intraductal breast carcinoma. 3 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry 
(IHC, HercepTest™) and FISH (fluorescent in situ hybridiza-
tion, PathVysion™ Kit). Both methods can be performed on 
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantifi- 
cation of the level of gene amplification present in the tumor, 
enabling differentiation between low- versus high-amplifica-
tion. At least one study has demonstrated a difference in
recurrence risk in women younger than 40 years of age for
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.7% for 
patients with no HER-2/neu gene amplification. 4 HER-2/neu
status may be particularly important to establish in women with 
small (<1 cm) tumor size. 

The choice of methodology for determination of HER-2/ 
neu status depends in part on the clinical setting. FDA approval 
for the Vysis FISH test was granted based on clinical trials 
involving 1549 node-positive patients. Patients received one 
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of adriamycin- 
based therapy, while those with normal HER-2/neu levels did 
not. The study therefore identified a subset of women who, 
because they did not benefit from more aggressive treatment, 
did not need to be exposed to the associated side effects. In 
addition, other evidence indicates that HER-2/neu amplifica- 
tion in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time and disease-related death. 5 Demonstration of HER- 
2/neu gene amplification by FISH has also been shown to be 
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (trastuzumab) mono-
clonal antibody therapy, however, is based upon demonstra-
tion of HER-2/neu protein overexpression using HercepTest™. 
Studies using Herceptin® in patients with metastatic breast 
cancer show an increase in time to disease progression, 
increased response rate to chemotherapeutic agents, and a small 
increase in overall survival rate. The FISH assays have not yet 
been approved for this purpose, and studies looking at response 
to Herceptin® in patients with or without gene amplification, 
as determined by FISH, are in progress. 

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results; 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clini- 
cal significance of such results is unclear. Based on the above 
considerations, HER-2/neu testing at SHMC/PAML will uti-
lize immunohistochemistry (HercepTest™) as a screen, fol-
lowed by FISH in IHC-negative cases. Alternatively, either 
method may be ordered individually depending on the clini- 
cal setting or clinician preference. 
CPT code information 

HER-2/neu via IHC 

88342 (including interpretive report) 

HER-2/neu via FISH 

88271 x 2 Molecular cytogenetics, DNA probe, each 
88274 Molecular cytogenetics, interphase in situ hybrid- 
ization, analyze 25-99 cells 
88291 Cytogenetics and molecular cytogenetics, interpre- 
tation and report 

Procedural Information 

Immunohistochemistry is performed using the FDA-approved 
DAKO antibody kit, HercepTest™. The DAKO kit contains 
reagents required to complete a two-step immunohisto-
chemical staining procedure for routinely processed, paraffin-
embedded specimens. Following incubation with the primary 
rabbit antibody to human HER-2/neu protein, the kit employs 
a ready-to-use dextran-based visualization reagent. This re-
agent consists of both secondary goat anti-rabbit antibody 
molecules and horseradish peroxidase molecules linked to a 
common dextran polymer backbone, thus eliminating the need 
for sequential application of link antibody and peroxidase-
conjugated antibody. Enzymatic conversion of the subse-
quently added chromogen results in formation of a visible 
reaction product at the antigen site. The specimen is then coun-
terstained, and a pathologist interprets the results by light 
microscopy. 

FISH analysis at SHMC/PAML is performed using the 
FDA-approved PathVysion™ HER-2/neu DNA probe kit, pro-
duced by Vysis, Inc. Formalin-fixed, paraffin-embedded breast 
tissue is processed using routine histological methods, and then 
slides are treated to allow hybridization of DNA probes to the 
nuclei present in the tissue section. The PathVysion™ kit con-
tains two direct-labeled DNA probes, one specific for the 
alphoid repetitive DNA (CEP 17, spectrum orange) present at 
the chromosome 17 centromere and the second for the HER-
2/neu oncogene located at 17q11.2-12 (spectrum green). Enu-
meration of the probes allows a ratio of the number of copies 
of HER-2/neu to the number of copies of chromosome 17 to 
be obtained; this enables quantification of low versus high 
amplification levels, and allows an estimate of the percentage 
of cells with HER-2/neu gene amplification. The clinically 
relevant distinction is whether the gene amplification is due 
to increased gene copy number on the two chromosome 17 
homologues normally present or an increase in the number of 
chromosome 17s in the cells. In the majority of cases, ratio 
equivalents less than 2.0 are indicative of a normal/negative 
result; ratios of 2.1 and over indicate that amplification is 
present and to what degree. Interpretation of these data will be 
performed and reported from the Vysis-certified Cytogenet-
ics laboratory at SHMC. 
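
The ratio-based interpretation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the laboratory's or the kit's actual procedure. The names FishCounts, her2_cep17_ratio, and classify_ratio are hypothetical, the ratio is assumed to be HER-2/neu signals divided by CEP 17 signals, and values from 2.0 up to (but not including) 2.1, which the text does not assign to either category, are simply labeled indeterminate here.

```python
from dataclasses import dataclass


@dataclass
class FishCounts:
    """Signal totals enumerated across the scored nuclei (hypothetical container)."""
    her2_signals: int   # HER-2/neu probe signals counted
    cep17_signals: int  # chromosome 17 centromere (CEP 17) signals counted


def her2_cep17_ratio(counts: FishCounts) -> float:
    """Return HER-2/neu signals per CEP 17 signal (assumed ratio orientation)."""
    if counts.cep17_signals <= 0:
        raise ValueError("CEP 17 signal count must be positive")
    return counts.her2_signals / counts.cep17_signals


def classify_ratio(ratio: float) -> str:
    """Apply the cutoffs quoted in the text: <2.0 negative, >=2.1 amplified."""
    if ratio < 2.0:
        return "not amplified"
    if ratio >= 2.1:
        return "amplified"
    # The text leaves the 2.0-2.1 interval unassigned; flag it rather than guess.
    return "indeterminate"


if __name__ == "__main__":
    example = FishCounts(her2_signals=200, cep17_signals=40)
    ratio = her2_cep17_ratio(example)
    print(f"ratio = {ratio:.2f}: {classify_ratio(ratio)}")
```

Normalizing to the centromere count is what separates true gene amplification from extra copies of chromosome 17: a cell with four copies of both targets still yields a ratio of 1.0.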
ABSTRACT Wnt family members are critical to many 
developmental processes, and components of the Wnt signal- 
ing pathway have been linked to tumorigenesis in familial and 
sporadic colon carcinomas. Here we report the identification 
of two genes, WISP-1 and WISP-2, that are up-regulated in the 
mouse mammary epithelial cell line C57MG transformed by 
Wnt-1, but not by Wnt-4. Together with a third related gene, 
WISP-3, these proteins define a subfamily of the connective 
tissue growth factor family. Two distinct systems demon- 
strated WISP induction to be associated with the expression of 
Wnt-1. These included (i) C57MG cells infected with a Wnt-1 
retroviral vector or expressing Wnt-1 under the control of a 
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic 
mice. The WISP-1 gene was localized to human chromosome 
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon 
cancer cell lines and in human colon tumors and its RNA 
overexpressed (2- to >30-fold) in 84% of the tumors examined 
compared with patient-matched normal mucosa. WISP-3 
mapped to chromosome 6q22-6q23 and also was overex-
pressed (4- to >40-fold) in 63% of the colon tumors analyzed. 
In contrast, WISP-2 mapped to human chromosome 20q12-
20q13 and its DNA was amplified, but RNA expression was 
reduced (2- to >30-fold) in 79% of the tumors. These results 
suggest that the WISP genes may be downstream of Wnt-1 
signaling and that aberrant levels of WISP expression in colon 
cancer may play a role in colon tumorigenesis. 



Wnt-1 is a member of an expanding family of cysteine-rich, 
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1, 
2). Wnt-1 originally was identified as an oncogene activated by 
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5). 

In mammalian cells, Wnt family members initiate signaling 
by binding to the seven-transmembrane spanning Frizzled 
receptors and recruiting the cytoplasmic protein Dishevelled 
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the 
kinase activity of the normally constitutively active glycogen 
synthase kinase-3β (GSK-3β), resulting in an increase in 
β-catenin levels. Stabilized β-catenin interacts with the tran-
scription factor TCF/Lef1, forming a complex that appears in 
the nucleus and binds TCF/Lef1 target DNA elements to 
activate transcription (7, 8). Other experiments suggest that 
the adenomatous polyposis coli (APC) tumor suppressor gene 
also plays an important role in Wnt signaling by regulating 
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds 
to β-catenin, and facilitates its degradation. Mutations in 
either APC or β-catenin have been associated with colon 
carcinomas and melanomas, suggesting these mutations con- 
tribute to the development of these types of cancer, implicating 
the Wnt pathway in tumorigenesis (1). 

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the tran-
scriptionally activated downstream components of Wnt 
signaling have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to 
Wnt signaling. Among the candidate Wnt target genes are 
those encoding the nodal-related 3 gene, Xnr3, a member of 
the transforming growth factor (TGF)-β superfamily, and the 
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois 
(2). A recent report also identifies c-myc as a target gene of the 
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Over- 
expression of Wnt-1 in this cell line is sufficient to induce a 
partially transformed phenotype, characterized by elongated 
and retractile cells that lose contact inhibition and form a 
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1 
and WISP-2, and a third related gene, WISP-3. The WISP genes 
are members of the CCN family of growth factors, which 
includes connective tissue growth factor (CTGF), Cyr61, and 
nov, a family not previously linked to Wnt signaling. 

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA 
Subtraction Kit (CLONTECH). Tester double-stranded 

Abbreviations: TGF, transforming growth factor; CTGF, connective 
tissue growth factor; SSH, suppression subtractive hybridization; 
VWC, von Willebrand factor type C module. 
Data deposition: The sequences reported in this paper have been 
deposited in the GenBank database (accession nos. AF100777, 
AF100778, AF100779, AF100780, and AF100781). 
†To whom reprint requests should be addressed. e-mail: 
diane@gene.com. 
cDNA was synthesized from 2 µg of poly(A)+ RNA isolated 
from the C57MG/Wnt-1 cell line and driver cDNA from 2 µg 
of poly(A)+ RNA from the parent C57MG cells. The sub-
tracted cDNA library was subcloned into a pGEM-T vector for 
further analysis. 

cDNA Library Screening. Clones encoding full-length 
mouse WISP-1 were isolated by screening a λgt10 mouse 
embryo cDNA library (CLONTECH) with a 70-bp probe from 
the original partial clone 568 sequence corresponding to amino 
acids 128-169. Clones encoding full-length human WISP-1 
were isolated by screening λgt10 lung and fetal kidney cDNA 
libraries with the same probe at low stringency. Clones en- 
coding full-length mouse and human WISP-2 were isolated by 
screening a C57MG/Wnt-1 or human fetal lung cDNA library 
with a probe corresponding to nucleotides 1463-1512. Full-
length cDNAs encoding WISP-3 were cloned from human 
bone marrow and fetal kidney libraries. 

Expression of Human WISP RNA. PCR amplification of 
first-strand cDNA was performed with human Multiple Tissue 
cDNA panels (CLONTECH) and 300 µM of each dNTP at 
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles. 
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. ³³P-labeled sense and antisense ribo-
probes were transcribed from an 897-bp PCR product corre-
sponding to nucleotides 601-1440 of mouse WISP-1 or a 
294-bp PCR product corresponding to nucleotides 82-375 of 
mouse WISP-2. All tissues were processed as described (40). 

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst 
dye 33258 intercalation fluorimetry. Total RNA was prepared 
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative 
gene amplification and RNA expression of WISPs and c-myc in 
the cell lines, colorectal tumors, and normal mucosa were 
determined by quantitative PCR. Gene-specific primers and 
fluorogenic probes (sequences available on request) were 
designed and used to amplify and quantitate the genes. The 
relative gene copy number was derived by using the formula 
2^(ΔCt), where ΔCt represents the difference in amplification 
cycles required to detect the WISP genes in peripheral blood 
lymphocyte DNA compared with colon tumor DNA or colon 
tumor RNA compared with normal mucosal RNA. The 
δ-method was used for calculation of the SE of the gene copy 
number or RNA expression level. The WISP-specific signal was 
normalized to that of the glyceraldehyde-3-phosphate dehy- 
drogenase housekeeping gene. All TaqMan assay reagents 
were obtained from Perkin-Elmer Applied Biosystems. 
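
The relative quantitation described above can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the authors' analysis code: the function name fold_change and its parameters are hypothetical, each amplification cycle is assumed to double the product exactly, and the normalization to glyceraldehyde-3-phosphate dehydrogenase is assumed to be applied by subtracting the housekeeping-gene cycle number within each sample before taking the difference.

```python
def fold_change(ct_target_ref: float, ct_gapdh_ref: float,
                ct_target_sample: float, ct_gapdh_sample: float) -> float:
    """Relative level of a target in a sample versus a reference, as 2^(delta Ct).

    ct_* values are threshold cycle numbers. The reference would be pooled
    normal DNA (copy number) or matched normal mucosa RNA (expression);
    the sample would be tumor DNA or tumor RNA.
    """
    # Normalize each target signal to the housekeeping gene in the same sample.
    norm_ref = ct_target_ref - ct_gapdh_ref
    norm_sample = ct_target_sample - ct_gapdh_sample
    # Fewer cycles in the sample means more starting template, so the
    # difference is taken as reference minus sample.
    delta_ct = norm_ref - norm_sample
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Hypothetical cycle numbers: the target is detected two cycles earlier
    # in the tumor than in the reference, with identical GAPDH values.
    print(fold_change(28.0, 20.0, 26.0, 20.0))  # 4.0
```

A value above 1 corresponds to amplification or overexpression in the sample, and a value below 1 to reduced copy number or expression.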

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify Wnt- 
1-inducible genes, we used the technique of SSH using the 



mouse mammary epithelial cell line C57MG and C57MG cells 
that stably express Wnt-1 (11). Candidate differentially ex- 
pressed cDNAs (1,384 total) were sequenced. Thirty-nine 
percent of the sequences matched known genes or homo- 
logues, 32% matched expressed sequence tags, and 29% had 
no match. To confirm that the transcript was differentially 
expressed, semiquantitative reverse transcription-PCR and 
Northern analysis were performed by using mRNA from the 
C57MG and C57MG/Wnt-1 cells. 

Two of the cDNAs, WISP-1 and WISP-2, were differentially 
expressed, being induced in the C57MG/Wnt-1 cell line, but 
not in the parent C57MG cells or C57MG cells overexpressing 
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce 
the morphological transformation of C57MG cells and has no 
effect on β-catenin levels (13, 14). Expression of WISP-1 was 
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the 
sequence compared with mouse WISP-1. The cDNA sequences 
of mouse and human WISP-1 were 1,766 and 2,830 bp in length, 
respectively, and encode proteins of 367 aa, with predicted 
relative molecular masses of ≈40,000 (Mr 40 K). Both have 
hydrophobic N-terminal signal sequences, 38 conserved cys- 
teine residues, and four potential N-linked glycosylation sites 
and are 84% identical (Fig. 2A). 

Full-length cDNA clones of mouse and human WISP-2 were 
1,734 and 1,293 bp in length, respectively, and encode proteins 
of 251 and 250 aa, respectively, with predicted relative molec-
ular masses of 27,000 (Mr 27 K) (Fig. 2B). Mouse and human 
WISP-2 are 73% identical. Human WISP-2 has no potential 
N-linked glycosylation sites, and mouse WISP-2 has one at 

[Blot lanes: C57MG Parent, Wnt-1, Wnt-4.] 

Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4, 
expression in C57MG cells. Northern analysis of WISP-1 (A) and 
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/
Wnt-4 cells. Poly(A)+ RNA (2 µg) was subjected to Northern blot 
analysis and hybridized with a 70-bp mouse WISP-1-specific probe 
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides 
1438-1627) in the 3' untranslated region. Blots were rehybridized with 
human β-actin probe. 
Fig. 2. Encoded amino acid sequence alignment of mouse and 
human WISP-1 (A) and mouse and human WISP-2 (B). The potential 
signal sequence, insulin-like growth factor-binding protein (IGF-BP), 
VWC, thrombospondin (TSP), and C-terminal (CT) domains are 
underlined. 

position 197. WISP-2 has 28 cysteine residues that are con- 
served among the 38 cysteines found in WISP-1. 

Identification of WISP-3. To search for related proteins, we 
screened expressed sequence tag (EST) databases with the 
WISP-1 protein sequence and identified several ESTs as 
potentially related sequences. We identified a homologous 
protein that we have called WISP-3. A full-length human 
WISP-3 cDNA of 1,371 bp was isolated corresponding to those 
ESTs that encode a 354-aa protein with a predicted molecular 
mass of 39,293. WISP-3 has two potential N-linked glycosyl- 
ation sites and 36 cysteine residues. An alignment of the three 
human WISP proteins shows that WISP-1 and WISP-3 are the 
most similar (42% identity), whereas WISP-2 has 37% identity 
with WISP-1 and 32% identity with WISP-3 (Fig. 3A). 

WISPs Are Homologous to the CTGF Family of Proteins. 
Human WISP-1, WISP-2, and WISP-3 are novel sequences; 
however, mouse WISP-1 is the same as the recently identified 
Elm1 gene. Elm1 is expressed in low, but not high, metastatic 
mouse melanoma cells, and suppresses the in vivo growth and 
metastatic potential of K-1735 mouse melanoma cells (15). 
Human and mouse WISP-2 are homologous to the recently 
described rat gene, rCop-1 (16). Significant homology (36- 
44%) was seen to the CCN family of growth factors. This family 
includes three members, CTGF, Cyr61, and the protoonco- 
gene nov. CTGF is a chemotactic and mitogenic factor for 
fibroblasts that is implicated in wound healing and fibrotic 
disorders and is induced by TGF-β (17). Cyr61 is an extracel-
lular matrix signaling molecule that promotes cell adhesion, 
proliferation, migration, angiogenesis, and tumor growth (18, 
19). nov (nephroblastoma overexpressed) is an immediate 
early gene associated with quiescence and found altered in 
Wilms tumors (20). The proteins of the CCN family share 
functional, but not sequence, similarity to Wnt-1. All are 
secreted, cysteine-rich heparin binding glycoproteins that as- 
sociate with the cell surface and extracellular matrix. 

WISP proteins exhibit the modular architecture of the CCN 
family, characterized by four conserved cysteine-rich domains 
(Fig. 3B) (21). The N-terminal domain, which includes the first 
12 cysteine residues, contains a consensus sequence (GCGC- 
CXXC) conserved in most insulin-like growth factor (IGF)- 
Fig. 3. (A) Encoded amino acid sequence alignment of human 
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not 
present in WISP-3 are indicated with a dot. (B) Schematic represen- 
tation of the WISP proteins showing the domain structure and cysteine 
residues (vertical lines). The four cysteine residues in the VWC domain 
that are absent in WISP-3 are indicated with a dot. (C) Expression of 
WISP mRNA in human tissues. PCR was performed on human 
multiple-tissue cDNA panels (CLONTECH) from the indicated adult 
and fetal tissues. 

binding proteins (BP). This sequence is conserved in WISP-2 
and WISP-3, whereas WISP-1 has a glutamine in the third 
position instead of a glycine. CTGF recently has been shown 
to specifically bind IGF (22) and a truncated nov protein 
lacking the IGF-BP domain is oncogenic (23). The von Wil- 
lebrand factor type C module (VWC), also found in certain 
collagens and mucins, covers the next 10 cysteine residues, and 
is thought to participate in protein complex formation and 
oligomerization (24). The VWC domain of WISP-3 differs 
from all CCN family members described previously, in that it 
contains only six of the 10 cysteine residues (Fig. 3 A and B). 
A short variable region follows the VWC domain. The third 
module, the thrombospondin (TSP) domain is involved in 
binding to sulfated glycoconjugates and contains six cysteine 
residues and a conserved WSxCSxxCG motif first identified in 
thrombospondin (25). The C-terminal (CT) module contain- 
ing the remaining 10 cysteines is thought to be involved in 
dimerization and receptor binding (26). The CT domain is 
present in all CCN family members described to date but is 
absent in WISP-2 (Fig. 3 A and B). The existence of a putative 
signal sequence and the absence of a transmembrane domain 
suggest that WISPs are secreted proteins, an observation 
supported by an analysis of their expression and secretion from 
mammalian cell and baculovirus cultures (data not shown). 

Expression of WISP mRNA in Human Tissues. Tissue- 
specific expression of human WISPs was characterized by PCR 
analysis on adult and fetal multiple tissue cDNA panels. 
WISP-1 expression was seen in the adult heart, kidney, lung, 
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C). 
Little or no expression was detected in the brain, liver, skeletal 
muscle, colon, peripheral blood leukocytes, prostate, testis, or 
thymus. WISP-2 had a more restricted tissue expression and 
was detected in adult skeletal muscle, colon, ovary, and fetal 
lung. Predominant expression of WISP-3 was seen in adult 
kidney and testis and fetal kidney. Lower levels of WISP-3 
expression were detected in placenta, ovary, prostate, and 
small intestine. 

In Situ Localization of WISP-1 and WISP-2. Expression of 
WISP-1 and WISP-2 was assessed by in situ hybridization in 
mammary tumors from Wnt-1 transgenic mice. Strong expres- 
sion of WISP-1 was observed in stromal fibroblasts lying within 
the fibrovascular tumor stroma (Fig. 4 A-D). However, low- 
level WISP-1 expression also was observed focally within tumor 
cells (data not shown). No expression was observed in normal 
breast. Like WISP-1, WISP-2 expression also was seen in the 
tumor stroma in breast tumors from Wnt-1 transgenic animals 
(Fig. 4 E-H). However, WISP-2 expression in the stroma was 
in spindle-shaped cells adjacent to capillary vessels, whereas 
Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained 
images from breast tumors in Wnt-1 transgenic mice. The correspond- 
ing dark-field images showing WISP-1 expression are shown in B and 
D. The tumor is a moderately well-differentiated adenocarcinoma 
showing evidence of adenoid cystic change. At low power (A and B), 
expression of WISP-1 is seen in the delicate branching fibrovascular 
tumor stroma (arrowhead). At higher magnification, expression is seen 
in the stromal(s) fibroblasts (C and D), and tumor cells are negative. 
Focal expression of WISP-1, however, was observed in tumor cells in 
some areas. Images of WISP-2 expression are shown in E-H. At low 
power (E and F), expression of WISP-2 is seen in cells lying within the 
fibrovascular tumor stroma. At higher magnification, these cells 
appeared to be adjacent to capillary vessels whereas tumor cells are 
negative (G and H). 



the predominant cell type expressing WISP-1 was the stromal 
fibroblasts. 

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately 
3.48 cR from the meiotic marker AFM259xc5 [logarithm of 
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the 
same region as the human locus of the novH family member 
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine 
mapping indicates that WISP-1 is located near D8S1712 STS. 
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on 
chromosome 20q12-20q13.1. Human WISP-3 mapped to chro-
mosome 6q22-6q23 and is linked to the marker AFM211ze5 
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to 
CTGF and 23 Mbs proximal to the human cellular oncogene 
MYB (27, 29). 

Amplification and Aberrant Expression of WISPs in Human 
Colon Tumors. Amplification of protooncogenes is seen in 
many human tumors and has etiological and prognostic sig- 
nificance. For example, in a variety of tumor types, c-myc 
amplification has been associated with malignant progression 
and poor prognosis (30). Because WISP-1 resides in the same 
general chromosomal location (8q24) as c-myc, we asked 
whether it was a target of gene amplification, and, if so, 
whether this amplification was independent of the c-myc locus. 
Genomic DNA from human colon cancer cell lines was 
assessed by quantitative PCR and Southern blot analysis (Fig. 
5 A and B). Both methods detected similar degrees of WISP-1 
amplification. Most cell lines showed significant (2- to 4-fold) 
amplification, with the HT-29 and WiDr cell lines demonstrat- 
ing an 8-fold increase. Significantly, the pattern of amplifica- 
tion observed did not correlate with that observed for c-myc, 
indicating that the c-myc gene is not part of the amplicon that 
involves the WISP-1 locus. 

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and 
WISP-2 was significantly greater than one, approximately 
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold 
for WISP-2 in 92% of the tumors (P < 0.001 for each). The 
copy number for WISP-3 was indistinguishable from one (P = 
0.166). In addition, the copy number of WISP-2 was signifi- 
cantly higher than that of WISP-1 (P < 0.001). 

The levels of WISP transcripts in RNA isolated from 19 
adenocarcinomas and their matched normal mucosa were 
Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell 
lines. (A) Amplification in cell line DNA was determined by quanti-
tative PCR. (B) Southern blots containing genomic DNA (10 µg) 
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with 
a 100-bp human WISP-1 probe (amino acids 186-219) or a human 
c-myc probe (located at bp 1901-2000). The WISP and myc genes are 
detected in normal human genomic DNA after a longer film exposure. 
Fig. 6. Genomic amplification of WISP genes in human colon 
tumors. The relative gene copy number of the WISP genes in 25 
adenocarcinomas was assayed by quantitative PCR, by comparing 
DNA from primary human tumors with pooled DNA from 10 healthy 
donors. The data are means ± SEM from one experiment done in 
triplicate. The experiment was repeated at least three times. 

assessed by quantitative PCR (Fig. 7). The level of WISP-1 
RNA present in tumor tissue varied but was significantly 
increased (2- to > 25-fold) in 84% (16/19) of the human colon 
tumors examined compared with normal adjacent mucosa. 
Four of 19 tumors showed greater than 10-fold overexpression. 
In contrast, in 79% (15/19) of the tumors examined, WISP-2 
RNA expression was significantly lower in the tumor than the 
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in 
63% (12/19) of the colon tumors compared with the normal 



[Plot for Fig. 7: panels WISP-1, WISP-2, and WISP-3; logarithmic y axis; x axis, Patient #/Dukes Stage.] 

Fig. 7. WISP RNA expression in primary human colon tumors 
relative to expression in normal mucosa from the same patient. 
Expression of WISP mRNA in 19 adenocarcinomas was assayed by 
quantitative PCR. The Dukes stage of the tumor is listed under the 
sample number. The data are means ± SEM from one experiment 
done in triplicate. The experiment was repeated at least twice. 



mucosa. The amount of overexpression of WISP-3 ranged from 
4- to >40-fold. 
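
As a quick arithmetic check, the percentages quoted for the 19 matched tumor/mucosa pairs follow directly from the stated counts. The sketch below is illustrative only, and the helper name percent is hypothetical.

```python
def percent(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


if __name__ == "__main__":
    total_tumors = 19
    # Counts as stated in the text for each gene.
    print("WISP-1 overexpressed:", percent(16, total_tumors))  # 84
    print("WISP-2 reduced:", percent(15, total_tumors))        # 79
    print("WISP-3 overexpressed:", percent(12, total_tumors))  # 63
```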



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3, 
are members of the CCN family of growth factors, which 
includes CTGF, Cyr61, and nov, a family not previously linked 
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in 
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of 
WISP RNA were measured in Wnt-1-transformed cells, hours 
or days after Wnt-1 transformation. Thus, WISP expression 
could result from Wnt-1 signaling directly through β-catenin 
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3. 
This domain is thought to be involved in receptor binding and 
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine 
knot motif exist as dimers (32). It is tempting to speculate that 
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2 
exists as a monomer. If the CT domain is also important for 
receptor binding, WISP-2 may bind its receptor through a 
different region of the molecule than the other CCN family 
members. No specific receptors have been identified for CTGF 
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed 
that mammary tumor cells or inflammatory cells at the tumor 
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was 
observed in the stromal cells that surrounded the tumor cells 
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling 
could occur in which the stromal cells could supply WISP-1 and 
WISP-2 to regulate tumor cell growth on the WISP extracel- 
lular matrix. Stromal cell-derived factors in the extracellular 
matrix have been postulated to play a role in tumor cell 
migration and proliferation (35). The localization of WISP-1 
and WISP-2 in the stromal cells of breast tumors supports this 
paracrine model. 

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression, whereas overexpression of 
WISP-3 RNA was seen in the absence of DNA amplification.
In contrast, WISP-2 DNA was amplified in the colon tumors, 
but its mRNA expression was significantly reduced in the 
majority of tumors compared with the expression in normal 
colonic mucosa from the same patient. The gene for human 
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in 
node negative breast cancer and many colon cancers, suggest- 
ing the existence of one or more oncogenes at this locus 
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification 
observed for WISP-2 may be caused by another gene in this 
amplicon. 

A recent manuscript on rCop-1, the rat orthologue of 
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to hu-
man carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1 
transforms cells and induces tumorigenesis is unknown, the 
identification of WISPs as genes that may be regulated down- 
stream of Wnt-1 in C57MG cells suggests they could be 
important mediators of Wnt-1 transformation. The amplifica- 
tion and altered expression patterns of the WISPs in human 
colon tumors may indicate an important role for these genes 
in tumor development. 

High-throughput technologies, such as proteomic screening and DNA micro-arrays, produce vast
amounts of data requiring comprehensive analytical methods to decipher the biologically relevant
results. One approach would be to manually search the biomedical literature; however, this would be
an arduous task. We developed an automated literature-mining tool, termed MedGene, which
comprehensively summarizes and estimates the relative strengths of all human gene-disease
relationships in Medline. Using MedGene, we analyzed a novel micro-array expression dataset
comparing breast cancer and normal breast tissue in the context of existing knowledge. We found no
correlation between the strength of the literature association and the magnitude of the difference in
expression level when considering changes as high as 5-fold; however, a significant correlation was
observed (r = 0.41; p = 0.05) among genes showing an expression difference of 10-fold or more.
Interestingly, this only held true for estrogen receptor (ER) positive tumors, not ER negative. MedGene 
identified a set of relatively understudied, yet highly expressed genes in ER negative tumors worthy of 
further examination. 

Introduction

At its current pace, the accumulation of biomedical literature 
outpaces the ability of most researchers and clinicians to stay 
abreast of their own immediate fields, let alone cover a broader 
range of topics. For example, to follow a single disease, e.g.,
breast cancer, a researcher would have had to scan 130 different
journals and read 27 papers per day in 1999. 1 This problem is
accentuated with high-throughput technologies such as DNA
micro-arrays and proteomics, which require the analysis of
large datasets involving thousands of genes, many of which are
unfamiliar to a particular researcher. In any microarray experi-
ment, thousands of genes may demonstrate statistically sig-
nificant expression changes, but only a fraction of these may
be relevant to the study. The ability to interpret these datasets 
would be enhanced if they could be compared to a compre- 
hensive summary of what is known about all genes. Thus, there 
is a need to summarize existing knowledge in a format that 
allows for the rapid analysis of associations between genes and 
diseases or other specific biological concepts. 

One solution to this problem is to compile structured digital 
resources, such as the Breast Cancer Gene Database 1 and the 
Tumor Gene Database. 2 However, as these resources are hand-
curated, the labor-intensive review process becomes a rate-
limiting step in the growth of the database. As a result, these
databases have a limited scale and the genes are not selected
in a systematic fashion.

An alternative approach is automated text mining: a method 
which involves automated information extraction by searching
documents for text strings and analyzing their frequency and
context. This approach has been used successfully in several
instances for biological applications. In most cases, it has been
applied to extract information about the relationships or
interactions that proteins or genes have with one another, in
the literature or by functional annotation. 3-7 Thus far, few
publications have applied text-mining to examine the global
relationships between genes and diseases. Perez-Iratxeta et al.
automatically examined the GO (Gene Ontology) annotation 
of genes and their predicted chromosomal locations in order 
to identify genes linked to inherited disorders. 8 

To obtain a more global understanding of disease develop-
ment, it would be valuable to incorporate information regarding
all possible gene-disease relationships, including biochemical,
physiological, pharmacological, epidemiological, as well as
genetic. This information would enable comprehensive com-
parisons between large experimental datasets and existing
knowledge in the literature. This would accomplish two things.
First, it would serve to validate experiments by demonstrating
that known responses occur as predicted. Second, it would
rapidly highlight which genes are corroborated by the literature
and which genes are novel in a given context. We have utilized
a computational approach to literature mining to produce a
comprehensive set of gene-disease relationships. In addition,
we have developed a novel approach to assess the strength of
each association based on the frequency of citation and co-
citation. We applied this tool to help interpret the data from a
large micro-array gene expression experiment comparing 
normal and cancerous breast tissue. 



Methods 

MedGene Database. MedGene is a relational database, stor-
ing disease and gene information from NCBI, text mining re-
sults, statistical scores, and hyperlinks to the primary lit-
erature. MedGene has a web-based user interface for users to
query the database (http://hipseq.med.harvard.edu/MedGene/).

Text Mining Algorithms. MeSH files were downloaded from 
the MeSH web site at NLM (National Library of Medicine) (http://
www.nlm.nih.gov/mesh/meshhome.html) and human disease
categories were selected. LocusLink files were downloaded from
the LocusLink web site at NCBI (http://www.ncbi.nih.gov/
LocusLink/). Official/preferred gene symbol, official/preferred
gene name, and gene alternative symbols and names, all
relevant annotations and URLs for each LocusLink record, were
collected. Gene search terms were used for literature searching 
and included all qualified gene names, gene symbols, and gene 
family terms. Primary gene keys, predominantly qualified gene 
family terms and gene official/preferred symbols, were used 
to index Medline records. If the official/preferred gene symbols 
did not meet the standards to be an index, then qualified gene 
official/preferred names were used. A local copy of Medline 
records (up to July, 2002) was pre-selected.

A JAVA module examined the MeSH terms and then indexed 
each Medline record with the appropriate disease terms. A 
separate JAVA module was used to examine the titles and 
abstracts for gene search terms and then to index the gene- 
related Medline records with the relevant primary gene key(s). 

Statistical Methods. For every gene and disease pair, we 
counted records that were indexed for both gene and disease 
(double positive hits), for disease only (disease single hits), for
gene only (gene single hits), and for neither gene nor disease
(double negative hits) to generate a 2 × 2 contingency table.
On the basis of the contingency table framework, we applied
different statistical methods to estimate the strength of gene-
disease relationships and evaluated the results. These methods
included chi-square analysis, Fisher's exact probabilities, rela-
tive risk of gene, and relative risk of disease 16 (http://
hipseq.med.harvard.edu/MedGene/). In addition, we computed
the "product of frequency", which is the product of the 
proportion of disease/gene double hits to disease single hits 
and the proportion of disease/gene double hits to gene single 
hits. To obtain a normal distribution, we transformed all the 
statistical scores using the natural logarithm. We selected the 
log of the product of frequency (LPF) to validate MedGene and 
to use for the analysis with the micro-array data. Spearman 
rank-correlation coefficients were used to assess the linear 
relationship between LPF and micro-array fold change in 
expression level. 
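
The counting and scoring steps described above can be sketched as follows. This is a minimal illustration, not the MedGene implementation (which the text says was written as JAVA modules over a relational database). The record layout, the names `Contingency`, `build_table`, and `log_product_of_frequency`, and the literal reading of "disease single hits" and "gene single hits" as the disease-only and gene-only cells are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """2 x 2 table for one gene-disease pair (hypothetical container)."""
    double_hits: int   # records indexed for both gene and disease
    disease_only: int  # disease single hits
    gene_only: int     # gene single hits
    neither: int       # double negative hits


def build_table(records, gene, disease):
    """Count the four cells from (gene_keys, disease_terms) pairs.

    `records` is an assumed layout: an iterable of two sets per
    Medline record, one of gene keys and one of disease terms.
    """
    both = d_only = g_only = neither = 0
    for gene_keys, disease_terms in records:
        has_gene = gene in gene_keys
        has_disease = disease in disease_terms
        if has_gene and has_disease:
            both += 1
        elif has_disease:
            d_only += 1
        elif has_gene:
            g_only += 1
        else:
            neither += 1
    return Contingency(both, d_only, g_only, neither)


def log_product_of_frequency(table):
    """Natural log of (double/disease single) * (double/gene single).

    Returns None when any of the three counts is zero, because the
    ratio or its logarithm is then undefined.
    """
    if 0 in (table.double_hits, table.disease_only, table.gene_only):
        return None
    product = (table.double_hits / table.disease_only) * (
        table.double_hits / table.gene_only
    )
    return math.log(product)
```

If the intended denominators are instead all records indexed for the disease (or gene), the two ratios would use `double_hits + disease_only` and `double_hits + gene_only`; the text does not settle this, so the sketch follows the wording literally.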

Global Analysis. Diseases with at least 50 related genes were 
selected for clustering analysis, and the LPF scores were 
normalized with total score for each disease. Hierarchical 
clustering was done with the "Cluster" software and the
clustering result was visualized using "TreeView" (http://
rana.lbl.gov/EisenSoftware.htm). 

Breast Tissue Micro-Arrays. Eighty-nine breast cancer
samples (79% ER-positive) and 7 normal breast tissue samples
were selected from the Harvard Breast SPORE frozen tissue
repository and were representative of the spectrum of histo-
logical types, grades, and hormone receptor immuno-pheno-
types of breast cancer. Biotinylated cRNA, generated from the
total RNA extracted from the bulk tumor, was hybridized to
Affymetrix U95A oligonucleotide micro-arrays. These micro-
arrays consist of 12 400 probes, which represent approximately
9000 genes. Raw expression values were obtained using GENE-
CHIP software from Affymetrix, and then further analyzed using
the DNA-Chip Analyzer (dChip) custom software. 

Results 

Automated Indexing of Medline Records by Disease and 
Gene. To study the gene-disease associations in the literature, 
we first compiled complete lists for human diseases and human 
genes. To index all Medline records that were relevant to 
human diseases, the Medical Subject Heading (MeSH) index 
of Medline records was utilized. MeSH is a controlled medical
vocabulary from the National Library of Medicine and consists
of a set of terms or subject headings that are arranged in both
an alphabetic and a hierarchical structure. Medline records
are reviewed manually and MeSH terms are added to each with
software assistance. 916 Twenty-three human disease category
headings along with all of their child terms (see the Supporting
Information, Supplemental Table 1, or visit http://hipseq.
med.harvard.edu/MedGene/publication/s_Table_1.html) were
selected from the 2002 MeSH index creating a list of 4033
human diseases. 

No index comparable to the MeSH index exists for genes, 
and thus, it was necessary to apply a string search algorithm
for gene names or symbols found in Medline text. A complete
list of genes, gene names, gene symbols, and frequently used
synonyms was collected from the LocusLink database at
NCBI, which contains 53 259 independent records keyed
by an official gene symbol or name (June 18th, 2002). For the
purposes of this study, no distinction was made between genes
and their gene products. Authors often use the same name for
both, differentiating the two only by the use of italics, if at all.
For the intended use of this study, this lack of distinction is
unlikely to have a large effect and may in fact be beneficial. 

Initial attempts to search the literature using these lists 
revealed several sources of false positives and false negatives 
(Table I). False positives primarily arose when the searched 
term had other meanings, whereas false negatives arose from 
syntax discrepancies, necessitating the development of filters
to reduce these errors. The syntax issues were readily handled 
by including alternate syntax forms in the search terms. The 
false positive cases, caused by duplicative and unrelated 
meanings for the terms, were more difficult to manage. Where 
possible, case sensitive string mapping reduced inappropriate 
citations. In many cases, however, this was not sufficient and 
the terms had to be eliminated entirely, thereby reducing the 
false positive rate but unavoidably under-representing some 
genes. 
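
The three kinds of filter described above (alternate syntax forms, case-sensitive matching for ambiguous symbols, and outright elimination of unusable terms) can be illustrated with a short sketch. The function names, the rule for generating a dashed variant, and the use of whole-word regular-expression matching are assumptions chosen for the example; the source does not describe the matching code at this level of detail.

```python
import re


def syntax_variants(symbol):
    """Return the symbol plus a dashed form, e.g. BAG1 -> {BAG1, BAG-1}.

    Only symbols made of letters followed by digits get a dashed
    variant in this sketch; real syntax variation is broader.
    """
    variants = {symbol}
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", symbol)
    if match:
        variants.add(f"{match.group(1)}-{match.group(2)}")
    return variants


def find_genes(text, symbols, case_sensitive=frozenset(), eliminated=frozenset()):
    """Return the set of symbols found in `text` as whole words.

    Symbols in `eliminated` are never searched. Symbols in
    `case_sensitive` must match exactly; all others are matched
    case-insensitively.
    """
    hits = set()
    for symbol in symbols:
        if symbol in eliminated:
            continue
        flags = 0 if symbol in case_sensitive else re.IGNORECASE
        for term in syntax_variants(symbol):
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, text, flags):
                hits.add(symbol)
                break
    return hits
```

With `case_sensitive={"WAS"}`, the ordinary word "was" in an abstract no longer counts as a hit for the WAS gene, while the dashed form "BAG-1" is still credited to BAG1. Eliminating a term such as MAG removes its false positives at the cost of under-representing that gene, which is the trade-off the paragraph above describes.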

For the purposes of data tracking, a primary gene key was 
selected to represent all synonyms that correspond to each
gene. Medline records were indexed with a primary gene key
when any synonym for that key was found in the title or
abstract. Case-insensitive string mapping was used for all 
searches except as noted above. No additional weight was 

Table 1. Systematic Sources of False Positives and False Negatives in Unfiltered Data*

source of error                          error type      example                                   filter solution

gene symbol/name is not unique           false positive  MAG: myelin associated glycoprotein;      eliminate this term
                                                         MAG: malignancy-associated protein
gene symbol is unrelated abbreviation    false positive  PA: pallid homologue (mouse), pallidin    eliminate this term
                                                         (also abbrev. for Pennsylvania)
gene symbol/name has language meaning    false positive  WAS: Wiskott-Aldrich Syndrome             case-sensitive string search
                                                         (also the word "was")
nonstandard syntax                       false negative  BAG-1 instead of BAG1                     add dash term
unofficial gene name/symbol              false negative  P53 instead of TP53                       add all gene nicknames
nonspecified gene name                   false negative  estrogen receptor instead of              add family stem term
                                                         Estrogen receptor 1

* In preliminary studies, Medline was searched for co-occurrence of genes and diseases and the resulting output was evaluated to identify error sources that [...]. Each error source is categorized by the type of error it causes: false positives are suggested relationships that are not real and [...] error. In general, error filters maximized sensitivity, even at the expense of specificity if needed.

added for multiple occurrences of a term or the co-occurrence
of multiple synonyms for the same gene key. 

Medline records were searched with all qualified gene
identifiers, such as the official/preferred gene symbol, the
official/preferred gene name, all gene nicknames and all syntax
variants. In situations where there are several members of a
gene family or splice variants, some authors prefer to use a
shortened gene family name, e.g., estrogen receptor instead of
estrogen receptor 1 (ESR1), creating a source of false negatives.
For this reason, gene family stem terms were created for all
genes that have an alpha or numerical suffix (e.g., IL2RA, TGFB1,
ESR1, etc.) and then used to search the literature. The family
stem terms were handled separately from the specific gene 
names so that it would be clear when linkages were made to 
the gene family versus a specific member in that family. 
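
One way to derive family stem terms from symbols with an alpha or numerical suffix is sketched below. The suffix rules, the requirement that a stem be shared by at least two symbols, and the function names are assumptions made for illustration; the source does not state how stems were generated, and it also applied stems to gene names (e.g., "estrogen receptor"), which this symbol-only sketch does not cover.

```python
import re
from collections import defaultdict


def candidate_stem(symbol):
    """Strip a trailing digit run, or else one trailing capital letter.

    ESR1 -> ESR, TGFB1 -> TGFB, IL2RA -> IL2R. Returns None when
    neither rule applies.
    """
    numeric = re.fullmatch(r"([A-Z0-9]*[A-Z])\d+", symbol)
    if numeric:
        return numeric.group(1)
    alpha = re.fullmatch(r"([A-Z0-9]{2,})[A-Z]", symbol)
    if alpha:
        return alpha.group(1)
    return None


def family_stems(symbols, min_members=2):
    """Group symbols by candidate stem, keeping stems shared by a family.

    Requiring several members avoids treating every symbol that merely
    ends in a letter (e.g. EGF) as a family member.
    """
    groups = defaultdict(set)
    for symbol in symbols:
        stem = candidate_stem(symbol)
        if stem is not None:
            groups[stem].add(symbol)
    return {
        stem: sorted(members)
        for stem, members in groups.items()
        if len(members) >= min_members
    }
```

Keeping the stem as a separate key from its members mirrors the point made above: a hit on the stem records a link to the family, not to a specific member.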

To improve performance and accuracy, some pre-selection 
was applied to the records that were scanned. First, review 
articles were eliminated to avoid redundant treatment of 
citations. Second, non-English journals were removed because
the natural language filters were only relevant to English 
publications. Finally, journals unlikely to contain primary data 
about gene-disease relationships were also removed (e.g., Int.
J. Health Educ., Bedside Nurse, and J. Health Econ.). Together,
these filters reduced the 12 198 221 Medline publications (July 
2002) by 37%. 

Ranking the Relative Strengths of Gene-Disease Associa-
tions. In total, there were 618 708 gene-disease co-citations,
in which 16% (8207) of all studied genes had been associated
to a disease and 96% (3875) of all diseases had been associated
to at least one gene. To rank the relative strengths of gene-
disease relationships, we tested several different statistical
methods and examined the results. With the exception of the
relative risk estimates, the methods provided similar results
with respect to the rank order of the gene-disease association
strengths. However, after comparing the results to other
databases and after consulting disease experts, the log of the
product of frequency (LPF) was selected for further analysis 
because it gave the best results overall. 

Validation of MedGene. In developing this tool, it was 
important to minimize the number of missed genes (false 
negatives) and miscalled genes (false positives). However, in
situations when these goals were in conflict, inclusiveness was
prioritized. To determine the false negative rate in MedGene,
breast cancer was used as a test case because it was associated
with more genes than any other human disease and because 




Figure 1. Estimation of the false negative rate by comparison 
with hand-curated databases. The breast cancer-related genes 
identified by MedGene were compared with those listed in 
several other databases including the Tumor Gene Database 
(TGD), 2 the Breast Cancer Gene Database (BCG), 1 GeneCards
(GC), and Swissprot. Genes were considered false negatives
if they were represented in at least one of these other databases
and not in MedGene and their link to breast cancer was sup-
ported by at least one literature reference. All literature references
were verified by manual review to confirm their validity. The
number of genes in each database or shared by more than one
database is indicated. The false negative rate was calculated by
genes missed at MedGene (26)/total number of nonoverlapping
genes in other databases (285). 

there were several public databases that link genes to breast 
cancer. We compared the list of breast cancer-related genes 
from MedGene to these databases, as illustrated in Figure 1.
Among the 285 distinct breast cancer-related genes that were
supported by at least one literature citation in these hand-
curated databases, 26 were absent from MedGene, suggesting
a false negative rate of approximately 9%. To determine why 
these were missed, all literature references for these genes (80 
papers) were reviewed manually (see the Supporting Informa-
tion, Supplemental Table 2, or visit http://hipseq.med.
harvard.edu/MedGene/publication/s_Table_2.html). Among
these papers, most false negatives were caused by nonstandard
gene terms or gene terms eliminated by our specificity filters.
Few genes were missed because they were only mentioned in
review papers (0.4%) or they appeared only in the body of the
manuscript but not the abstract or title (1.1%). Of note,
MedGene identified approximately 2000 additional breast
cancer-related genes not listed in any other database. 

To assess the false positive error rate, two complementary 
approaches were used: a detailed analysis of one disease and 
a global examination of 1000 diseases. The detailed approach 
examined the false positive error rate and its sources, whereas
the global approach tested whether the overall results made 
biomedical sense. 

Using the LPF, 1467 genes related to prostate cancer were 
assembled in rank order. We then retrieved approximately 300 
Medline records each for the highest ranked 100 and the lowest
ranked 200 genes and manually reviewed the titles and
abstracts to determine the verity of the association. Nearly 80%
of the highest ranked 100 genes fell into one of the five
categories that reflect meaningful gene-disease relationships
(see the Supporting Information, Supplemental Table 3, or visit
http://hipseq.med.harvard.edu/MedGene/publication/
s_Table_3.html). Among the lowest ranked 200 genes, ap-
proximately 70% reflected true relationships. Of the 600 records
reviewed there were only two in which the association between
the gene and the disease was described as negative. Both were 
genes with very low scores. In both cases, the authors did not 
argue the absence of any relationship, but rather that a 
particular feature of the gene or protein was not shown to be 
related to human prostate cancer.

The coincidence of some gene symbols with medical ab-
breviations, chemical abbreviations and biological abbrevia-
tions resulted in most of the false positives (see the Supporting
Information, Supplemental Table 4, or visit http://hipseq.
med.harvard.edu/MedGene/publication/s_Table_4.html), em-
phasizing the importance of the filters that were added in the
search algorithm (Table 1). Without the filters, the false positive
rate more than doubled, and the false negative rate rose
dramatically (data not shown). For example, among the papers
about breast cancer, there were only 12 Medline records that
referred to ESR1 and 10 to ESR2, whereas almost 2000 papers
mentioned estrogen receptor without specifying ESR1 or ESR2;
this latter group was detected by the family stem term filter. 

To further validate these results, a global analysis of the gene-
disease relationships described by MedGene was performed.
For this experiment, it was reasoned that the more closely
related the diseases are to one another, the more they will be
related to the same gene sets. Thus, if the relationships defined
by MedGene accurately reflected the literature, then an unsu-
pervised hierarchical clustering of the gene data should group
diseases in a manner consistent with common medical think-
ing. Conversely, if the clustered diseases do not make sense
biologically or medically, it may reflect excessive false positives,
false negatives, or inappropriate scoring of the data.

To execute this experiment, the gene sets and the corre- 
sponding LPF values for 1000 randomly selected diseases (each 
with at least 50 gene relationships) were used as a dataset for 
clustering the diseases. A review of the results showed that the 
resulting disease clusters were indeed logical based upon 
common medical knowledge (see the Supporting Information,

408 Journal of Proteome Research • Vol. 2, No. 4, 2003

Hu et al.

Supplemental Figure 1, or visit http://hipseq.med.harvard.edu/
MedGene/publication/s_Figure_1.html). For example, in one
such cluster shown in Figure 2, diabetes and its complications
grouped together and were also closely linked to diseases
associated with starvation states.

The number of genes associated with a given disease can 
be estimated by adjusting the MedGene number up by the false
negative rate (~9%) and down by the false positive rate (~26%
on average). Using this, the average disease has 103.7 ± 45.3
(mean ± s.d.) genes associated with it, although the range is
quite broad with 2359 genes related to breast cancer, 2122 
genes related to lung cancer and no genes related to a number 
of diseases. 

Applying MedGene to the Analysis of Large Datasets. Access
to a comprehensive summary of the genes linked to human
diseases provided an opportunity to analyze data obtained from
a high-throughput experiment. We compared the MedGene
breast cancer gene list to a gene expression data set generated
from a micro-array analysis comparing breast cancer and
normal breast tissue samples. Micro-array analysis identified
2286 genes that had greater than a 1-fold difference in mean
expression level between breast cancer samples and normal
breast samples. Using MedGene, we sorted the 2286 genes into
four classes: 555 genes directly linked to breast cancer in the
literature by gene term search (first-degree association by gene
name); 328 genes directly linked by family term search (first-
degree association by family term); 1021 genes linked to breast
cancer only through other breast cancer genes (second-degree
association); and 505 genes not previously associated with
breast cancer. (See the Supporting Information, Supplemental
Figure 2, or visit http://hipseq.med.harvard.edu/MedGene/
publication/s_Figure_2.html.) Among the 505 previously un-
related genes, 467 were either newly identified genes or genes
that had not previously been associated with any disease. 
Among the remaining 38 genes, 9 had been related to other 
cancers, specifically esophageal, colon, uterine, skin, and cervix. 

To determine whether the genes highlighted by the micro- 
array analysis were more likely to have been previously linked 
to breast cancer in the literature, we created a two-dimensional
plot of the fold change of expression level between breast 
cancer and normal tissue versus the literature score (LPF) 
(Figure 3A). There was a broad spread of expression changes 
among the genes directly linked to breast cancer ranging from 
less than I-fold change (68%) to over 40-fold (0.3%). Notably, 
the majority of genes with greater than 10-fold expression
changes were linked to breast cancer by first-degree associa-
tion.

Among all 754 genes directly linked to breast cancer in the 
literature, there was no correlation between LPF and micro-
array fold change (r = 0.018, p-value = 0.62). However, when
we stratified the analysis based on the magnitude of the fold 
change, we observed an increasing trend in correlation (Figure 
3B) suggesting that genes with a more substantial change in 
expression level were more likely to have a stronger association 
in the literature. For genes that had 10-fold change or more in 
expression level, the correlation increased to 0.41 (p-value =
0.05).
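
The stratified analysis described above, in which the Spearman coefficient is recomputed on the genes that pass successively higher fold-change cutoffs, can be sketched as follows. This is an illustration under stated assumptions: the function names, the `min_genes` guard, and the choice to keep genes whose fold change is greater than or equal to each threshold are not taken from the source, and no p-values are computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


def stratified_spearman(lpf_scores, fold_changes, thresholds, min_genes=3):
    """Correlate LPF with fold change among genes at or above each cutoff."""
    results = {}
    for threshold in thresholds:
        pairs = [
            (score, fold)
            for score, fold in zip(lpf_scores, fold_changes)
            if fold >= threshold
        ]
        if len(pairs) < min_genes:
            results[threshold] = None
            continue
        scores, folds = zip(*pairs)
        results[threshold] = spearman(scores, folds)
    return results
```

Running the same function with LPF scores for an unrelated disease gives the negative control described later in the text for hypertension.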

When we evaluated the micro-array data separately for ER 
positive and ER negative tumors, the trend in correlation 
between fold change and literature score was highly dependent 
on estrogen receptor status. Interestingly, there was a similar 
trend in correlation for ER positive tumors, but no trend in
correlation for ER negative tumors. 

Figure 2. Global validation by clustering analysis. 2(A). The gene sets and the corresponding LPF values for 1000 diseases, each with
at least 50 gene relationships, were used in an unsupervised clustering of the diseases based on the gene patterns associated with
them. A sample of the data is shown here. 2(B). One of the resulting clusters is shown that corresponds to blood sugar states. Diabetes
terms (above the line) and starvation states terms (under the line) clustered together. Within these groups, there is also clustering of
diabetic small vessel complications, altered serum chemistries, nutritional disorders, etc. (Supplemental Figure 1: http://hipseq.med.
harvard.edu/MedGene/publication/s_Figure_1.html).

Finally, to validate our findings, we computed similar correlations between the breast cancer expression data and
LPF scores generated by MedGene for hypertension, a disease unrelated to breast cancer. As expected, we did not
observe an increasing trend in correlation for hypertension.

Figure 3. Relationship between literature score and functional data for breast cancer. 3A. The data from an expression analysis of
samples for breast tumors and normal breast tissue were analyzed to indicate the fold difference of expression level between breast
tumor and normal sample (cutoff > 3-fold change). The fold changes were plotted against the literature score for the same gene set.
Green dots represent first-degree association by gene search, blue dots represent first-degree association by family search and red
dots represent no association. Some well-studied genes, such as BRCA2 (pink circle), are not reflected by a substantial difference in
expression level. Furthermore, the majority of genes that have no association with breast cancer in the literature had less than 10-fold 
expression changes (shaded area). 3B. The Spearman rank-correlation coefficients between literature score (LPF) and the fold change 
of expression level between tumor and normal breast samples (y-axis) in relation to the amount of fold change of expression level 
(x-axis). Gene rank lists were generated for breast cancer (blue) and hypertension (pink). Correlations were also computed between 
the breast cancer gene LPF scores and fold change expression data among estrogen receptor positive tumors only (light blue) and 
estrogen receptor negative tumors only (purple). 
Table 2. Top 25 Genes Related to Selected Human Diseases*

breast neoplasms   hypertension          rheumatoid arthritis      bipolar disorder     atherosclerosis

estrogen receptor  REN                   RA                        ERDA1                apolipoprotein
PGR                DBP                   TNFRSF10A                 SNAP29               APOE
ERBB2              LEP                   CRP                       PFKL                 LDLR
BRCA1              AGT                   AS                        DRD2                 ELN
BRCA2              INS                   ESR1                      TRH                  ARG1
EGFR               kallikrein            HLA-DRB1                  IMPA2                APOB
CYP19              ACE                   DR1                       HTR3A                APOA1
TFF1               endothelin            interleukin               DRD3                 MSR1
PSEN2              5/0046                TNF                       REM                  LPL
TP53               BDK                   IL6                       KCNN3                PON1
CES3               D1ANPH                collagen                  DRD4                 plasminogen activator inhibitor
CEACAM5            SARI                  JUA                       HTR2C                PLC
ERBB3              PIH                   ACR                       REIN 1               vascular cell adhesion molecule
cyclin             CD59                  TNFRSF12                  DBH                  ATOM
COX5A              ALB                   IL2                       MAOA                 VWF
cathepsin          CYP11B2               CHI3L1                    COMT                 INS
ERBB4              MAT2B                 IL8                       HTR2A                ARG2
TRAM               angiotensin receptor  interleukin 1             SYNJ1                ABCA1
CCND1              ACTR2                 matrix metalloproteinase  INPP1                OLR1
EGF                NPPA                  interferon                NEDD4L               collagen
MUC1               LVM                   CD68                      FRA13C               MCP
insulin-like       PBH                   IL4                       transducer of ERBB2  lipoprotein
BCL2               NPY                   IL17                      BAIAP3               APOA2
mucin              POMC                  MMP3                      ATP1B3               intercellular adhesion molecule
FGF3               neuropeptide          S1L                       DRD5                 RAB27A


* MedGene results for the top 25 genes associated with breast neoplasms, hypertension, rheumatoid arthritis, bipolar disorder, and atherosclerosis, respectively,
ranked [illegible]. [Illegible] all the papers co-citing the gene and the disease is available at the MedGene website (http://hipseq.med.harvard.edu/
MedGene/).



Discussion

The Human Genome Project heralded a new era in biological
research where the emphasis on understanding specific pathways
has expanded to global studies of genomic organization
and biological systems. High-throughput technologies can
provide novel insight into comprehensive biological function
but also introduce new challenges. The utility of these
technologies is limited by the ability to generate, analyze, and
interpret large gene lists. MedGene, a relational database
derived by mining the information in Medline, was created to
address this need. MedGene users can query for a rank-ordered
list of human gene-disease relationships (Table 2) for one or
more diseases. Each entry is hyperlinked to the original papers
supporting the association and to other relevant databases.

MedGene is an innovative extension of previous text mining
approaches. Perez-Iratxeta et al. used GO annotation and
chromosomal locations to predict genes that may contribute
to inherited disorders (8). MedGene takes a broader view
and includes all diseases and all possible gene-disease
relationships. Furthermore, MedGene utilizes co-citation to indicate a
relationship rather than GO annotation, which is limited to the
subset of genes that have GO annotation. Our approach is
complementary to that taken by Chaussabel and Sher, who
used the frequency of co-cited terms to cluster genes into a
hierarchy of gene-gene relationships (6).

A unique aspect of this tool is the ability to assess the relative
strength of gene-disease relationships based on the frequency
of both co-citation and single citation. This presupposes that
most co-citations describe a positive association, a tendency often
referred to as publication bias (15), and is supported by our observation
that negative associations are rare (Supplemental Table 3:
http://hipseq.med.harvard.edu/MedGene/publication/s_table3.html).
Of course, relationships established by frequency
of co-citation do not necessarily represent a true biological link;
however, frequent co-citation is strong evidence in support of a true
relationship.
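
The idea of scoring a gene-disease pair from co-citation and single-citation counts can be sketched as follows. This is only an illustration: the function name, the product-of-proportions formula, and all counts are assumptions made for this example, and the formula is not presented as the definition of the LPF score used in the text.

```python
import math

def log_product_of_frequency(double_hits, disease_hits, gene_hits):
    """Hypothetical score: log10 of (double/disease) * (double/gene).

    double_hits  = papers citing both the gene and the disease
    disease_hits = papers citing the disease
    gene_hits    = papers citing the gene
    """
    if double_hits <= 0:
        return float("-inf")  # never co-cited: weakest possible score
    if double_hits > min(disease_hits, gene_hits):
        raise ValueError("co-citations cannot exceed single citations")
    return math.log10((double_hits / disease_hits) * (double_hits / gene_hits))

# Made-up counts: (co-citations, disease citations, gene citations)
counts = {
    "GENE_A": (900, 50_000, 3_000),   # often co-cited, moderately studied gene
    "GENE_B": (40, 50_000, 20_000),   # heavily studied gene, rarely co-cited
    "GENE_C": (5, 50_000, 10),        # rarely studied gene, half its papers co-cite
}

# Rank genes from strongest to weakest association with the disease.
ranked = sorted(
    counts,
    key=lambda g: log_product_of_frequency(*counts[g]),
    reverse=True,
)
print(ranked)
```

Dividing by both single-citation counts keeps a heavily studied gene from ranking high on volume alone, which is one reason to use single citations as well as co-citations.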

Another important feature of MedGene is the implementation
of software filters that substantially reduced the error rate.
We estimate that less than 10% of all associations were missed
and that at least 70% of even the weakest associations were real.
For this study, all of the filters that we applied were general
ones, e.g., expanding the list of all gene names to address the
different syntax forms used by different journals, eliminating
gene names that correspond to common English words, etc.
The majority of the remaining search term ambiguities were
idiosyncratic and difficult to identify systematically without
causing a significant rise in false negatives. Alternative
approaches, such as the examination of the nearest neighbor
terms, need to be considered to further reduce the false positive
rate.
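
The two general filters named above (expanding gene names into alternative syntax forms, and dropping names that are common English words) might look like the sketch below. The stop list, the variant rules, and the function names are assumptions for illustration only; the actual filters are not specified in this section.

```python
import re

# Assumed stop list of gene symbols that are also ordinary English words.
# A real list would be much longer.
COMMON_WORDS = {"was", "can", "rest", "fat"}

def name_variants(symbol):
    """Expand a symbol such as 'ERBB2' into 'ERBB2', 'ERBB-2', 'ERBB 2'."""
    match = re.fullmatch(r"([A-Za-z]+)[- ]?(\d+)", symbol)
    if not match:
        return {symbol}  # no letters-then-digits pattern: keep as written
    stem, number = match.groups()
    return {f"{stem}{number}", f"{stem}-{number}", f"{stem} {number}"}

def searchable_terms(symbols):
    """Drop ambiguous symbols, then expand the rest into syntax variants."""
    terms = set()
    for symbol in symbols:
        if symbol.lower() in COMMON_WORDS:
            continue  # would match ordinary English text, so skip it
        terms |= name_variants(symbol)
    return terms

print(sorted(searchable_terms(["ERBB2", "WAS", "TP53"])))
```

Dropping a symbol outright trades false positives for false negatives, which is the tension the paragraph above describes.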

It is not uncommon to see expression changes in micro-array
experiments as small as 2-fold reported in the literature.
Even when these expression changes are statistically significant,
it is not always clear if they are biologically meaningful. When
comparing expression levels of disease to normal tissue, one
expects an enrichment of known disease-related genes to
appear in the altered expression group. MedGene provided a
unique opportunity to test this notion in the context of existing
knowledge on a novel breast cancer micro-array dataset. For
genes displaying a 5-fold change or less in tumors compared
to normal, there was no evidence of a correlation between
altered gene expression and a known role in the disease. This
Table 3. Genes with Large Expression Changes in ER- but
Not in ER+ Breast Tumors

gene symbol   fold change (ER+)   fold change (ER-)

KRTHB1        1.0                 610.8
BRS3          1.2                 89.4
DKK1          1.2                 69.8
ZIC1          1.9                 59.6
TUU           1.0                 38.5
KIAA0680      2.6                 33.2
CDKN3         1.0                 30.6
EBI2          4.0                 27.9
GZMB          3.8                 21.9
$1X18         4.7                 18.6
GPR49         1.0                 14.6
MYOW          1.6                 14.4
LAD1          -1.0                13.5
POLE2         4.2                 13.0
HMC4          4.4                 12.9
BCL2L11       -1.2                12.3
LRP8          2.9                 12.2
CCNB2         1.0                 11.8
CCNB2         4.0                 11.6
FOB           -4.3                11.1
KNSL6*        2.9                 10.9
HIF5          3.0                 10.2
SERPINH2      4.6                 10.2
YAP1          1.0                 10.0
LPNB          -1.3                -10.4
TCEA2         -1.1                -10.8
TFF1          1.3                 -11.4
COL17A1       -4.1                -15.7
POPS          1.1                 -16.2
BPAG1         -4.6                -22.3
PDZK1         -1.1                -36.8
VEGFC         -2.8                -51.5
MUC6          -1.4                -64.9
SERPINA5      -1.0                -83.1
MEIS1         -1.6                -85.9
CA12          2.4                 -150.3

MedGene identified a set of relatively understudied, yet highly
expressed genes in ER-negative, but not ER-positive, breast tumors. All of
these genes have either never been co-cited with breast cancer or have a
weak association, except those marked with an *.



reflects the many genes whose role in breast cancer may not
involve large changes in expression in sporadic tumors (e.g.,
BRCA1 and BRCA2) and genes whose modest changes in
expression may be unrelated to the disease. Strikingly, among
genes with a 10-fold change or more in expression level, there
was a strong and significant correlation between expression
level and a published role in the disease, providing the first
global validation of the micro-array approach to identifying
disease-specific genes.
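
The comparison described above, a rank correlation between literature score and fold change restricted to genes above a fold-change cutoff, can be sketched in a few lines. The data values, the function names, and the choice to use the absolute fold change are assumptions for illustration; only the use of the Spearman rank correlation comes from the text.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)

def correlation_above(genes, min_fold):
    """Correlate |fold change| with literature score for |fold| >= min_fold."""
    kept = [(abs(fold), score) for fold, score in genes if abs(fold) >= min_fold]
    if len(kept) < 3:
        raise ValueError("too few genes above the cutoff")
    folds, scores = zip(*kept)
    return spearman(folds, scores)

# Made-up (fold change, literature score) pairs.
genes = [(2.0, -5.0), (-3.0, -2.0), (12.0, -4.0), (-20.0, -3.0), (50.0, -1.0)]
for cutoff in (2, 10):
    print(cutoff, round(correlation_above(genes, cutoff), 3))
```

Sweeping the cutoff over a range of values gives the kind of curve that a plot of correlation against fold change would show.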

The results derived from MedGene have two implications.
First, a careful hunt for corroborating evidence of a role in
breast cancer should precede any further study of genes with
less than 5-fold expression level changes. Second, any genes
with 10-fold changes or more are likely to be related to breast
cancer and warrant attention. It is likely that this threshold will
change depending on the disease as well as the experiment.

Interestingly, the observed correlation was only found among
ER-positive tumors, not ER-negative. This may reflect a bias
in the literature to study the more prevalent type of tumor in
the population. Furthermore, this emphasizes that caution
must be taken when interpreting experiments that may contain
subpopulations that behave very differently. The MedGene
approach identified a set of relatively understudied, yet highly
expressed genes in ER-negative tumors that are worthy of
further examination (Table 3).
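
A selection of this kind (large change in ER-negative tumors, small change in ER-positive tumors) could be expressed as a simple filter. The cutoffs of 10-fold and 5-fold are borrowed from the thresholds discussed earlier and are an assumption here, not a stated selection rule for Table 3; the gene names and values are placeholders, and the additional literature-score criterion is left out.

```python
def er_negative_specific(rows, high=10.0, low=5.0):
    """Keep genes with |fold| >= high in ER- tumors and |fold| < low in ER+.

    Each row is (symbol, fold_change_er_positive, fold_change_er_negative).
    """
    hits = [r for r in rows if abs(r[2]) >= high and abs(r[1]) < low]
    # Largest increases first, largest decreases last.
    return sorted(hits, key=lambda r: r[2], reverse=True)

# Placeholder rows, not taken from the study.
rows = [
    ("GENE_A", 1.0, 42.0),    # changed only in ER-: kept
    ("GENE_B", 12.0, 15.0),   # changed in both subtypes: dropped
    ("GENE_C", -1.5, -30.0),  # strongly reduced only in ER-: kept
    ("GENE_D", 2.0, 3.0),     # small change in both: dropped
]
selected = er_negative_specific(rows)
for symbol, er_pos, er_neg in selected:
    print(f"{symbol:8s} {er_pos:6.1f} {er_neg:7.1f}")
```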



Hu et al. 

In conclusion, we have developed an automated method of
summarizing and organizing the vast biomedical literature. To
our knowledge, the resulting database is the most comprehensive
and accurate of its kind. By generating a score that reflects
the strength of the association, it provides an important tool
for the rapid and flexible analysis of large datasets from various
high-throughput screening experiments. Furthermore, it can
be used for selecting subsets of genes for functional studies,
for building disease-specific arrays, for looking at genes common
to multiple diseases, and for various other high-throughput
applications. In the future, it will be possible to enhance the
utility of the MedGene database by building links between
genes and other MeSH terms as well as other biological
processes and concepts, such as cell division and responses to
small molecules.
Supporting Information Available: Twenty-three
human disease category headings along with all of their child
terms selected from the 2002 MeSH index (Supplemental Table
1); analysis of the causes of false negatives in MedGene
(Supplemental Table 2); meaningful gene-disease relationships
found in MedGene (Supplemental Table 3); causes for incorrect
assignment of gene indexes (Supplemental Table 4); a review
of the results, showing that the resulting disease clusters were
indeed logical (Supplemental Figure 1); and a review of the
results showing that among the 505 previously unrelated genes,
467 were either newly identified genes or genes that had not
previously been associated with any disease (Supplemental
Figure 2). This material is available free of charge via the
Internet at http://pubs.acs.org and at the web sites mentioned
in the text.

References 

(1) Baasiri, R. A.; Glasser, S. R.; Steffen, D. L.; Wheeler, D. A. Oncogene
1999, 18, 7958-7965.

(2) Steffen, D. L.; Levine, A. E.; Yarus, S.; Baasiri, R. A.; Wheeler, D.
A. Bioinformatics 2000, 16, 639-649.

(3) Marcotte, E. M.; Xenarios, I.; Eisenberg, D. Bioinformatics 2001,
17, 359-363.

(4) Ono, T.; Hishigaki, H.; Tanigami, A.; Takagi, T. Bioinformatics
2001, 17, 155-161.

(5) Jenssen, T. K.; Laegreid, A.; Komorowski, J.; Hovig, E. Nat. Genet.
2001, 28, 21-28.

(6) Chaussabel, D.; Sher, A. Genome Biol. 2002, 3, RESEARCH0055.

(7) Gibbons, F. D.; Roth, F. P. Genome Res. 2002, 12, 1574-1581.

(8) Perez-Iratxeta, C.; Bork, P.; Andrade, M. A. Nat. Genet. 2002, 31,
316-319.

(9) Funk, M. E.; Reid, C. A. Bull. Med. Libr. Assoc. 1983, 71, 176-183.

(10) Humphrey, S. M.; Miller, N. E. J. Am. Soc. Inf. Sci. 1987, 38, 184-196.

(11) Maglott, D. R.; Katz, K. S.; Sicotte, H.; Pruitt, K. D. Nucleic Acids
Res. 2000, 28, 126-128.

(12) Pruitt, K. D.; Maglott, D. R. Nucleic Acids Res. 2001, 29, 137-140.

(13) Wadelius, M.; Andersson, A.; Johansson, J. E.; Wadelius, C.;
Rane, E. Pharmacogenetics 1999, 9, 333-340.

(14) Adam, R. M.; Borer, J. G.; Williams, J.; Eastham, J. A.; Loughlin,
K. R.; Freeman, M. R. Endocrinology 1999, 140, 5866-5875.

(15) Montori, V. M.; Smieja, M.; Guyatt, G. H. Mayo Clin. Proc. 2000,
75, 1284-1288.

(16) Denenberg, V. H. Statistics and Experimental Design for Behavioral
and Biological Researchers; Wiley-Liss: New York, 1976.

(17) Rebhan, M.; Chalifa-Caspi, V.; Prilusky, J.; Lancet, D. Trends Genet.
1997, 13, 163.

(18) Bairoch, A.; Apweiler, R. Nucleic Acids Res. 2000, 28, 45-48.

PR0340227



412 Journal of Proteome Research, Vol. 2, No. 4, 2003



This Page is Inserted by IFW Indexing and Scanning 
Operations and is not part of the Official Record 

BEST AVAILABLE IMAGES 

Defective images within this document are accurate representations of the original 
documents submitted by the applicant. 

Defects in the images include but are not limited to the items checked: 

☑ BLACK BORDERS

□ IMAGE CUT OFF AT TOP, BOTTOM OR SIDES

□ FADED TEXT OR DRAWING

☑ BLURRED OR ILLEGIBLE TEXT OR DRAWING

□ SKEWED/SLANTED IMAGES

□ COLOR OR BLACK AND WHITE PHOTOGRAPHS

□ GRAY SCALE DOCUMENTS

☑ LINES OR MARKS ON ORIGINAL DOCUMENT

□ REFERENCE(S) OR EXHIBIT(S) SUBMITTED ARE POOR QUALITY 

□ OTHER: 

IMAGES ARE BEST AVAILABLE COPY. 
As rescanning these documents will not correct the image 
problems checked, please do not report these problems to 
the IFW Image Problem Mailbox. 



